diff mbox series

[6/7] drm: Document fdinfo format specification

Message ID 20220106165536.57208-7-tvrtko.ursulin@linux.intel.com (mailing list archive)
State New, archived
Headers show
Series Per client GPU stats | expand

Commit Message

Tvrtko Ursulin Jan. 6, 2022, 4:55 p.m. UTC
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Proposal to standardise the fdinfo text format as optionally output by DRM
drivers.

Idea is that a simple but, well defined, spec will enable generic
userspace tools to be written while at the same time avoiding a more heavy
handed approach of adding a mid-layer to DRM.

i915 implements a subset of the spec, everything apart from the memory
stats currently, and a matching intel_gpu_top tool exists.

Open is to see if AMD can migrate to using the proposed GPU utilisation
key-value pairs, or if they are not workable to see whether to go
vendor specific, or if a standardised  alternative can be found which is
workable for both drivers.

Same for the memory utilisation key-value pairs proposal.

v2:
 * Update for removal of name and pid.

v3:
 * 'Drm-driver' tag will be obtained from struct drm_driver.name. (Daniel)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: David M Nieto <David.Nieto@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Daniel Stone <daniel@fooishbar.org>
Cc: Chris Healy <cphealy@gmail.com>
Acked-by: Christian König <christian.koenig@amd.com>
---
 Documentation/gpu/drm-usage-stats.rst | 97 +++++++++++++++++++++++++++
 Documentation/gpu/index.rst           |  1 +
 2 files changed, 98 insertions(+)
 create mode 100644 Documentation/gpu/drm-usage-stats.rst

Comments

Daniel Vetter Jan. 19, 2022, 3:08 p.m. UTC | #1
On Thu, Jan 06, 2022 at 04:55:35PM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Proposal to standardise the fdinfo text format as optionally output by DRM
> drivers.
> 
> Idea is that a simple but, well defined, spec will enable generic
> userspace tools to be written while at the same time avoiding a more heavy
> handed approach of adding a mid-layer to DRM.
> 
> i915 implements a subset of the spec, everything apart from the memory
> stats currently, and a matching intel_gpu_top tool exists.
> 
> Open is to see if AMD can migrate to using the proposed GPU utilisation
> key-value pairs, or if they are not workable to see whether to go
> vendor specific, or if a standardised  alternative can be found which is
> workable for both drivers.
> 
> Same for the memory utilisation key-value pairs proposal.
> 
> v2:
>  * Update for removal of name and pid.
> 
> v3:
>  * 'Drm-driver' tag will be obtained from struct drm_driver.name. (Daniel)
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: David M Nieto <David.Nieto@amd.com>
> Cc: Christian König <christian.koenig@amd.com>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: Daniel Stone <daniel@fooishbar.org>
> Cc: Chris Healy <cphealy@gmail.com>
> Acked-by: Christian König <christian.koenig@amd.com>

I'm assuming this ack here and later on is a "amdgpu plans to use this
too" kind of affair. Especially also in the lights of eventually using
matching semantics for cgroups and everything else tied to gpu execution
resource management.

If not I'm mildly worried that we're creating fake-standard stuff here
which cannot actually be used by anything resembling driver-agnostic
userspace.
-Daniel

> ---
>  Documentation/gpu/drm-usage-stats.rst | 97 +++++++++++++++++++++++++++
>  Documentation/gpu/index.rst           |  1 +
>  2 files changed, 98 insertions(+)
>  create mode 100644 Documentation/gpu/drm-usage-stats.rst
> 
> diff --git a/Documentation/gpu/drm-usage-stats.rst b/Documentation/gpu/drm-usage-stats.rst
> new file mode 100644
> index 000000000000..c669026be244
> --- /dev/null
> +++ b/Documentation/gpu/drm-usage-stats.rst
> @@ -0,0 +1,97 @@
> +.. _drm-client-usage-stats:
> +
> +======================
> +DRM client usage stats
> +======================
> +
> +DRM drivers can choose to export partly standardised text output via the
> +`fops->show_fdinfo()` as part of the driver specific file operations registered
> +in the `struct drm_driver` object registered with the DRM core.
> +
> +One purpose of this output is to enable writing as generic as practicaly
> +feasible `top(1)` like userspace monitoring tools.
> +
> +Given the differences between various DRM drivers the specification of the
> +output is split between common and driver specific parts. Having said that,
> +wherever possible effort should still be made to standardise as much as
> +possible.
> +
> +File format specification
> +=========================
> +
> +- File shall contain one key value pair per one line of text.
> +- Colon character (`:`) must be used to delimit keys and values.
> +- All keys shall be prefixed with `drm-`.
> +- Whitespace between the delimiter and first non-whitespace character shall be
> +  ignored when parsing.
> +- Neither keys or values are allowed to contain whitespace characters.
> +- Numerical key value pairs can end with optional unit string.
> +- Data type of the value is fixed as defined in the specification.
> +
> +Key types
> +---------
> +
> +1. Mandatory, fully standardised.
> +2. Optional, fully standardised.
> +3. Driver specific.
> +
> +Data types
> +----------
> +
> +- <uint> - Unsigned integer without defining the maximum value.
> +- <str> - String excluding any above defined reserved characters or whitespace.
> +
> +Mandatory fully standardised keys
> +---------------------------------
> +
> +- drm-driver: <str>
> +
> +String shall contain the name this driver registered as via the respective
> +`struct drm_driver` data structure.
> +
> +Optional fully standardised keys
> +--------------------------------
> +
> +- drm-pdev: <aaaa:bb.cc.d>
> +
> +For PCI devices this should contain the PCI slot address of the device in
> +question.
> +
> +- drm-client-id: <uint>
> +
> +Unique value relating to the open DRM file descriptor used to distinguish
> +duplicated and shared file descriptors. Conceptually the value should map 1:1
> +to the in kernel representation of `struct drm_file` instances.
> +
> +Uniqueness of the value shall be either globally unique, or unique within the
> +scope of each device, in which case `drm-pdev` shall be present as well.
> +
> +Userspace should make sure to not double account any usage statistics by using
> +the above described criteria in order to associate data to individual clients.
> +
> +- drm-engine-<str>: <uint> ns
> +
> +GPUs usually contain multiple execution engines. Each shall be given a stable
> +and unique name (str), with possible values documented in the driver specific
> +documentation.
> +
> +Value shall be in specified time units which the respective GPU engine spent
> +busy executing workloads belonging to this client.
> +
> +Values are not required to be constantly monotonic if it makes the driver
> +implementation easier, but are required to catch up with the previously reported
> +larger value within a reasonable period. Upon observing a value lower than what
> +was previously read, userspace is expected to stay with that larger previous
> +value until a monotonic update is seen.
> +
> +- drm-memory-<str>: <uint> [KiB|MiB]
> +
> +Each possible memory type which can be used to store buffer objects by the
> +GPU in question shall be given a stable and unique name to be returned as the
> +string here.
> +
> +Value shall reflect the amount of storage currently consumed by the buffer
> +object belong to this client, in the respective memory region.
> +
> +Default unit shall be bytes with optional unit specifiers of 'KiB' or 'MiB'
> +indicating kibi- or mebi-bytes.
> diff --git a/Documentation/gpu/index.rst b/Documentation/gpu/index.rst
> index b9c1214d8f23..b99dede9a5b1 100644
> --- a/Documentation/gpu/index.rst
> +++ b/Documentation/gpu/index.rst
> @@ -10,6 +10,7 @@ Linux GPU Driver Developer's Guide
>     drm-kms
>     drm-kms-helpers
>     drm-uapi
> +   drm-usage-stats
>     driver-uapi
>     drm-client
>     drivers
> -- 
> 2.32.0
>
Tvrtko Ursulin Jan. 20, 2022, 10:30 a.m. UTC | #2
On 19/01/2022 15:08, Daniel Vetter wrote:
> On Thu, Jan 06, 2022 at 04:55:35PM +0000, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> Proposal to standardise the fdinfo text format as optionally output by DRM
>> drivers.
>>
>> Idea is that a simple but, well defined, spec will enable generic
>> userspace tools to be written while at the same time avoiding a more heavy
>> handed approach of adding a mid-layer to DRM.
>>
>> i915 implements a subset of the spec, everything apart from the memory
>> stats currently, and a matching intel_gpu_top tool exists.
>>
>> Open is to see if AMD can migrate to using the proposed GPU utilisation
>> key-value pairs, or if they are not workable to see whether to go
>> vendor specific, or if a standardised  alternative can be found which is
>> workable for both drivers.
>>
>> Same for the memory utilisation key-value pairs proposal.
>>
>> v2:
>>   * Update for removal of name and pid.
>>
>> v3:
>>   * 'Drm-driver' tag will be obtained from struct drm_driver.name. (Daniel)
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> Cc: David M Nieto <David.Nieto@amd.com>
>> Cc: Christian König <christian.koenig@amd.com>
>> Cc: Daniel Vetter <daniel@ffwll.ch>
>> Cc: Daniel Stone <daniel@fooishbar.org>
>> Cc: Chris Healy <cphealy@gmail.com>
>> Acked-by: Christian König <christian.koenig@amd.com>
> 
> I'm assuming this ack here and later on is a "amdgpu plans to use this
> too" kind of affair. Especially also in the lights of eventually using
> matching semantics for cgroups and everything else tied to gpu execution
> resource management.
> 
> If not I'm mildly worried that we're creating fake-standard stuff here
> which cannot actually be used by anything resembling driver-agnostic
> userspace.

Hard to say how much adoption there would be.

At least on the statement of that the proposed spec cannot be used for 
driver agnostic userspace, do you have concrete concerns with the spec I 
proposed, or are just going by the lack of continuous engagement by any 
third party?

Apart from AMD, during past postings Daniel Stone also had positive 
feedback (along the lines of "works the driver I am familiar with"). I 
don't know if I have missed someone else who provided feedback, hope not.

There is of course the option of dropping the idea of trying to document 
a common spec, or to do anything cross-driver at this point. AFAIR it 
was your push to try this, and I agreed it would be a good thing if it 
worked out. But given AMD already exposes stuff in fdinfo, I don't think 
it would be a blocker for merging the i915 side even if we decided to 
drop the standardisation effort for now. Given I am maintaining this 
i915 code from ~2018 and there is a lot of interest from users it would 
be good to put it in.

Regards,

Tvrtko
Rob Clark Jan. 20, 2022, 4:44 p.m. UTC | #3
On Wed, Jan 19, 2022 at 7:09 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Thu, Jan 06, 2022 at 04:55:35PM +0000, Tvrtko Ursulin wrote:
> > From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >
> > Proposal to standardise the fdinfo text format as optionally output by DRM
> > drivers.
> >
> > Idea is that a simple but, well defined, spec will enable generic
> > userspace tools to be written while at the same time avoiding a more heavy
> > handed approach of adding a mid-layer to DRM.
> >
> > i915 implements a subset of the spec, everything apart from the memory
> > stats currently, and a matching intel_gpu_top tool exists.
> >
> > Open is to see if AMD can migrate to using the proposed GPU utilisation
> > key-value pairs, or if they are not workable to see whether to go
> > vendor specific, or if a standardised  alternative can be found which is
> > workable for both drivers.
> >
> > Same for the memory utilisation key-value pairs proposal.
> >
> > v2:
> >  * Update for removal of name and pid.
> >
> > v3:
> >  * 'Drm-driver' tag will be obtained from struct drm_driver.name. (Daniel)
> >
> > Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > Cc: David M Nieto <David.Nieto@amd.com>
> > Cc: Christian König <christian.koenig@amd.com>
> > Cc: Daniel Vetter <daniel@ffwll.ch>
> > Cc: Daniel Stone <daniel@fooishbar.org>
> > Cc: Chris Healy <cphealy@gmail.com>
> > Acked-by: Christian König <christian.koenig@amd.com>
>
> I'm assuming this ack here and later on is a "amdgpu plans to use this
> too" kind of affair. Especially also in the lights of eventually using
> matching semantics for cgroups and everything else tied to gpu execution
> resource management.
>
> If not I'm mildly worried that we're creating fake-standard stuff here
> which cannot actually be used by anything resembling driver-agnostic
> userspace.

I think I could implement something like this for drm/msm.  I am a bit
uncertain about the memory stats (ie. how are we intended to account
for imported/exported/shared bo's)?  But we already track cycles+time
per submit for devfreq, it would be pretty easy to add per drm_file
counters to accumulate the per-submit results.  We could even track
per-context (submitqueue) for processes that have more than a single
context, although not sure if that is useful.

And I think there is probably some room for shared helper to print
parts other than the per-engine stats (and maybe memory stats,
although even that could be a shared implementation for some
drivers).. with a driver callback for the non-generic parts, ie.
something like:

   drm_driver::show_client_stats(struct drm_file *, struct drm_printer *)

but that can come later.

If there is a tool somewhere that displays this info, that would be
useful for testing my implementation.

BR,
-R

> -Daniel
>
> > ---
> >  Documentation/gpu/drm-usage-stats.rst | 97 +++++++++++++++++++++++++++
> >  Documentation/gpu/index.rst           |  1 +
> >  2 files changed, 98 insertions(+)
> >  create mode 100644 Documentation/gpu/drm-usage-stats.rst
> >
> > diff --git a/Documentation/gpu/drm-usage-stats.rst b/Documentation/gpu/drm-usage-stats.rst
> > new file mode 100644
> > index 000000000000..c669026be244
> > --- /dev/null
> > +++ b/Documentation/gpu/drm-usage-stats.rst
> > @@ -0,0 +1,97 @@
> > +.. _drm-client-usage-stats:
> > +
> > +======================
> > +DRM client usage stats
> > +======================
> > +
> > +DRM drivers can choose to export partly standardised text output via the
> > +`fops->show_fdinfo()` as part of the driver specific file operations registered
> > +in the `struct drm_driver` object registered with the DRM core.
> > +
> > +One purpose of this output is to enable writing as generic as practicaly
> > +feasible `top(1)` like userspace monitoring tools.
> > +
> > +Given the differences between various DRM drivers the specification of the
> > +output is split between common and driver specific parts. Having said that,
> > +wherever possible effort should still be made to standardise as much as
> > +possible.
> > +
> > +File format specification
> > +=========================
> > +
> > +- File shall contain one key value pair per one line of text.
> > +- Colon character (`:`) must be used to delimit keys and values.
> > +- All keys shall be prefixed with `drm-`.
> > +- Whitespace between the delimiter and first non-whitespace character shall be
> > +  ignored when parsing.
> > +- Neither keys or values are allowed to contain whitespace characters.
> > +- Numerical key value pairs can end with optional unit string.
> > +- Data type of the value is fixed as defined in the specification.
> > +
> > +Key types
> > +---------
> > +
> > +1. Mandatory, fully standardised.
> > +2. Optional, fully standardised.
> > +3. Driver specific.
> > +
> > +Data types
> > +----------
> > +
> > +- <uint> - Unsigned integer without defining the maximum value.
> > +- <str> - String excluding any above defined reserved characters or whitespace.
> > +
> > +Mandatory fully standardised keys
> > +---------------------------------
> > +
> > +- drm-driver: <str>
> > +
> > +String shall contain the name this driver registered as via the respective
> > +`struct drm_driver` data structure.
> > +
> > +Optional fully standardised keys
> > +--------------------------------
> > +
> > +- drm-pdev: <aaaa:bb.cc.d>
> > +
> > +For PCI devices this should contain the PCI slot address of the device in
> > +question.
> > +
> > +- drm-client-id: <uint>
> > +
> > +Unique value relating to the open DRM file descriptor used to distinguish
> > +duplicated and shared file descriptors. Conceptually the value should map 1:1
> > +to the in kernel representation of `struct drm_file` instances.
> > +
> > +Uniqueness of the value shall be either globally unique, or unique within the
> > +scope of each device, in which case `drm-pdev` shall be present as well.
> > +
> > +Userspace should make sure to not double account any usage statistics by using
> > +the above described criteria in order to associate data to individual clients.
> > +
> > +- drm-engine-<str>: <uint> ns
> > +
> > +GPUs usually contain multiple execution engines. Each shall be given a stable
> > +and unique name (str), with possible values documented in the driver specific
> > +documentation.
> > +
> > +Value shall be in specified time units which the respective GPU engine spent
> > +busy executing workloads belonging to this client.
> > +
> > +Values are not required to be constantly monotonic if it makes the driver
> > +implementation easier, but are required to catch up with the previously reported
> > +larger value within a reasonable period. Upon observing a value lower than what
> > +was previously read, userspace is expected to stay with that larger previous
> > +value until a monotonic update is seen.
> > +
> > +- drm-memory-<str>: <uint> [KiB|MiB]
> > +
> > +Each possible memory type which can be used to store buffer objects by the
> > +GPU in question shall be given a stable and unique name to be returned as the
> > +string here.
> > +
> > +Value shall reflect the amount of storage currently consumed by the buffer
> > +object belong to this client, in the respective memory region.
> > +
> > +Default unit shall be bytes with optional unit specifiers of 'KiB' or 'MiB'
> > +indicating kibi- or mebi-bytes.
> > diff --git a/Documentation/gpu/index.rst b/Documentation/gpu/index.rst
> > index b9c1214d8f23..b99dede9a5b1 100644
> > --- a/Documentation/gpu/index.rst
> > +++ b/Documentation/gpu/index.rst
> > @@ -10,6 +10,7 @@ Linux GPU Driver Developer's Guide
> >     drm-kms
> >     drm-kms-helpers
> >     drm-uapi
> > +   drm-usage-stats
> >     driver-uapi
> >     drm-client
> >     drivers
> > --
> > 2.32.0
> >
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
Tvrtko Ursulin Jan. 21, 2022, 11:50 a.m. UTC | #4
On 20/01/2022 16:44, Rob Clark wrote:
> On Wed, Jan 19, 2022 at 7:09 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>>
>> On Thu, Jan 06, 2022 at 04:55:35PM +0000, Tvrtko Ursulin wrote:
>>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>
>>> Proposal to standardise the fdinfo text format as optionally output by DRM
>>> drivers.
>>>
>>> Idea is that a simple but, well defined, spec will enable generic
>>> userspace tools to be written while at the same time avoiding a more heavy
>>> handed approach of adding a mid-layer to DRM.
>>>
>>> i915 implements a subset of the spec, everything apart from the memory
>>> stats currently, and a matching intel_gpu_top tool exists.
>>>
>>> Open is to see if AMD can migrate to using the proposed GPU utilisation
>>> key-value pairs, or if they are not workable to see whether to go
>>> vendor specific, or if a standardised  alternative can be found which is
>>> workable for both drivers.
>>>
>>> Same for the memory utilisation key-value pairs proposal.
>>>
>>> v2:
>>>   * Update for removal of name and pid.
>>>
>>> v3:
>>>   * 'Drm-driver' tag will be obtained from struct drm_driver.name. (Daniel)
>>>
>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>> Cc: David M Nieto <David.Nieto@amd.com>
>>> Cc: Christian König <christian.koenig@amd.com>
>>> Cc: Daniel Vetter <daniel@ffwll.ch>
>>> Cc: Daniel Stone <daniel@fooishbar.org>
>>> Cc: Chris Healy <cphealy@gmail.com>
>>> Acked-by: Christian König <christian.koenig@amd.com>
>>
>> I'm assuming this ack here and later on is a "amdgpu plans to use this
>> too" kind of affair. Especially also in the lights of eventually using
>> matching semantics for cgroups and everything else tied to gpu execution
>> resource management.
>>
>> If not I'm mildly worried that we're creating fake-standard stuff here
>> which cannot actually be used by anything resembling driver-agnostic
>> userspace.
> 
> I think I could implement something like this for drm/msm.  I am a bit
> uncertain about the memory stats (ie. how are we intended to account
> for imported/exported/shared bo's)?  But we already track cycles+time
> per submit for devfreq, it would be pretty easy to add per drm_file
> counters to accumulate the per-submit results.  We could even track
> per-context (submitqueue) for processes that have more than a single
> context, although not sure if that is useful.

Interesting tidbit is that the whole i915 work started from a customer 
request to expose just that (per context) in a form akin to 
getrusage(2). I think this kind of introspection capability is 
interesting but as it is driver specific territory it's only anecdotal 
for what this thread is concerned.

> And I think there is probably some room for shared helper to print
> parts other than the per-engine stats (and maybe memory stats,
> although even that could be a shared implementation for some
> drivers).. with a driver callback for the non-generic parts, ie.
> something like:
> 
>     drm_driver::show_client_stats(struct drm_file *, struct drm_printer *)
> 
> but that can come later.
> 
> If there is a tool somewhere that displays this info, that would be
> useful for testing my implementation.

I have a patch to Intel specific intel_gpu_top (see 
https://patchwork.freedesktop.org/patch/468491/?series=98555&rev=1). 
I'll have a look to see how much work would it be to extract common bits 
into a library and write a quick agnostic tool using it.

Regards,

Tvrtko
Tvrtko Ursulin Jan. 25, 2022, 10:24 a.m. UTC | #5
On 21/01/2022 11:50, Tvrtko Ursulin wrote:
> On 20/01/2022 16:44, Rob Clark wrote:

[snip]

>> If there is a tool somewhere that displays this info, that would be
>> useful for testing my implementation.
> 
> I have a patch to Intel specific intel_gpu_top (see 
> https://patchwork.freedesktop.org/patch/468491/?series=98555&rev=1). 
> I'll have a look to see how much work would it be to extract common bits 
> into a library and write a quick agnostic tool using it.

I factored out some code from intel_gpu_top in a quick and dirty attempt to make it generic and made a very rudimentary tools/gputop:

https://cgit.freedesktop.org/~tursulin/intel-gpu-tools/log/?h=gputop
  
If you manage to export the right fdinfo tags (basically https://patchwork.freedesktop.org/patch/468502/?series=92574&rev=6)*, with the only local addition I have being the optional "drm-engine-capacity-<str>: <uint>" tag, we may get lucky and tool might even work. Let me know when you try. If it will work you should see something like this:

DRM minor 0
    PID              NAME    render       copy       video
   3838          kwin_x11 |█         ||          ||          ||          |
327056               mpv |          ||          ||▌         ||          |
327056               mpv |▌         ||          ||          ||          |
      1           systemd |▍         ||          ||          ||          |
   3884       plasmashell |          ||          ||          ||          |
   4794           krunner |          ||          ||          ||          |
   4836       thunderbird |          ||          ||          ||          |
296733         GeckoMain |          ||          ||          ||          |

Regards,

Tvrtko

*) Or for more reference this is how the i915 output looks like:

$ sudo cat /proc/7296/fdinfo/10
pos:    0
flags:  02100002
mnt_id: 26
ino:    501
drm-driver:     i915
drm-pdev:       0000:00:02.0
drm-client-id:  22
drm-engine-render:      196329331 ns
drm-engine-copy:        0 ns
drm-engine-video:       0 ns
drm-engine-capacity-video:      2
drm-engine-video-enhance:       0 ns

P.S. There is no AMD support in the current code, or nothing for memory either. Both can be added later.
Christian König Jan. 25, 2022, 10:36 a.m. UTC | #6
Wow, nice.

Certainly +1 from my side to have that generalized, documented and 
uniform among all drivers.

Regards,
Christian.

Am 25.01.22 um 11:24 schrieb Tvrtko Ursulin:
>
> On 21/01/2022 11:50, Tvrtko Ursulin wrote:
>> On 20/01/2022 16:44, Rob Clark wrote:
>
> [snip]
>
>>> If there is a tool somewhere that displays this info, that would be
>>> useful for testing my implementation.
>>
>> I have a patch to Intel specific intel_gpu_top (see 
>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpatchwork.freedesktop.org%2Fpatch%2F468491%2F%3Fseries%3D98555%26rev%3D1&amp;data=04%7C01%7Cchristian.koenig%40amd.com%7C99e7138493364003d3e908d9dfece57f%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637787030821307243%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=X5Qa7rZkyvplRJ5fJwaKjwa%2FtII9EZmg5PIYMOcqKiA%3D&amp;reserved=0). 
>> I'll have a look to see how much work would it be to extract common 
>> bits into a library and write a quick agnostic tool using it.
>
> I factored out some code from intel_gpu_top in a quick and dirty 
> attempt to make it generic and made a very rudimentary tools/gputop:
>
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcgit.freedesktop.org%2F~tursulin%2Fintel-gpu-tools%2Flog%2F%3Fh%3Dgputop&amp;data=04%7C01%7Cchristian.koenig%40amd.com%7C99e7138493364003d3e908d9dfece57f%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637787030821307243%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=FBhSRN8vyl6zgT8vGZ8kRhWFtt59dZUCHMmeCl9gdnI%3D&amp;reserved=0 
>
>
> If you manage to export the right fdinfo tags (basically 
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpatchwork.freedesktop.org%2Fpatch%2F468502%2F%3Fseries%3D92574%26rev%3D6&amp;data=04%7C01%7Cchristian.koenig%40amd.com%7C99e7138493364003d3e908d9dfece57f%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637787030821307243%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=a0aBjo%2B2WyfzSfrjWkHteEuZAmncLbO8Z%2F5ypZKumlU%3D&amp;reserved=0)*, 
> with the only local addition I have being the optional 
> "drm-engine-capacity-<str>: <uint>" tag, we may get lucky and tool 
> might even work. Let me know when you try. If it will work you should 
> see something like this:
>
> DRM minor 0
>    PID              NAME    render       copy       video
>   3838          kwin_x11 |█         ||          || ||          |
> 327056               mpv |          ||          ||▌ ||          |
> 327056               mpv |▌         ||          || ||          |
>      1           systemd |▍         ||          || ||          |
>   3884       plasmashell |          ||          || ||          |
>   4794           krunner |          ||          || ||          |
>   4836       thunderbird |          ||          || ||          |
> 296733         GeckoMain |          ||          || ||          |
>
> Regards,
>
> Tvrtko
>
> *) Or for more reference this is how the i915 output looks like:
>
> $ sudo cat /proc/7296/fdinfo/10
> pos:    0
> flags:  02100002
> mnt_id: 26
> ino:    501
> drm-driver:     i915
> drm-pdev:       0000:00:02.0
> drm-client-id:  22
> drm-engine-render:      196329331 ns
> drm-engine-copy:        0 ns
> drm-engine-video:       0 ns
> drm-engine-capacity-video:      2
> drm-engine-video-enhance:       0 ns
>
> P.S. There is no AMD support in the current code, or nothing for 
> memory either. Both can be added later.
Tvrtko Ursulin Feb. 21, 2022, 11:14 a.m. UTC | #7
Hi Rob,

On 25/01/2022 10:24, Tvrtko Ursulin wrote:
> 
> On 21/01/2022 11:50, Tvrtko Ursulin wrote:
>> On 20/01/2022 16:44, Rob Clark wrote:
> 
> [snip]
> 
>>> If there is a tool somewhere that displays this info, that would be
>>> useful for testing my implementation.
>>
>> I have a patch to Intel specific intel_gpu_top (see 
>> https://patchwork.freedesktop.org/patch/468491/?series=98555&rev=1). 
>> I'll have a look to see how much work would it be to extract common 
>> bits into a library and write a quick agnostic tool using it.
> 
> I factored out some code from intel_gpu_top in a quick and dirty attempt 
> to make it generic and made a very rudimentary tools/gputop:
> 
> https://cgit.freedesktop.org/~tursulin/intel-gpu-tools/log/?h=gputop

Have you managed to spend any time playing with this yet?

The only remaining open was Daniel's mild concern if vendor agnostic 
userspace is possible using the proposed spec. If you managed to wire up 
the compliant exports and gputop tool works I think that concern would 
be settled.

Regards,

Tvrtko
diff mbox series

Patch

diff --git a/Documentation/gpu/drm-usage-stats.rst b/Documentation/gpu/drm-usage-stats.rst
new file mode 100644
index 000000000000..c669026be244
--- /dev/null
+++ b/Documentation/gpu/drm-usage-stats.rst
@@ -0,0 +1,97 @@ 
+.. _drm-client-usage-stats:
+
+======================
+DRM client usage stats
+======================
+
+DRM drivers can choose to export partly standardised text output via the
+`fops->show_fdinfo()` as part of the driver specific file operations registered
+in the `struct drm_driver` object registered with the DRM core.
+
+One purpose of this output is to enable writing as generic as practicaly
+feasible `top(1)` like userspace monitoring tools.
+
+Given the differences between various DRM drivers the specification of the
+output is split between common and driver specific parts. Having said that,
+wherever possible effort should still be made to standardise as much as
+possible.
+
+File format specification
+=========================
+
+- File shall contain one key value pair per one line of text.
+- Colon character (`:`) must be used to delimit keys and values.
+- All keys shall be prefixed with `drm-`.
+- Whitespace between the delimiter and first non-whitespace character shall be
+  ignored when parsing.
+- Neither keys or values are allowed to contain whitespace characters.
+- Numerical key value pairs can end with optional unit string.
+- Data type of the value is fixed as defined in the specification.
+
+Key types
+---------
+
+1. Mandatory, fully standardised.
+2. Optional, fully standardised.
+3. Driver specific.
+
+Data types
+----------
+
+- <uint> - Unsigned integer without defining the maximum value.
+- <str> - String excluding any above defined reserved characters or whitespace.
+
+Mandatory fully standardised keys
+---------------------------------
+
+- drm-driver: <str>
+
+String shall contain the name this driver registered as via the respective
+`struct drm_driver` data structure.
+
+Optional fully standardised keys
+--------------------------------
+
+- drm-pdev: <aaaa:bb.cc.d>
+
+For PCI devices this should contain the PCI slot address of the device in
+question.
+
+- drm-client-id: <uint>
+
+Unique value relating to the open DRM file descriptor used to distinguish
+duplicated and shared file descriptors. Conceptually the value should map 1:1
+to the in kernel representation of `struct drm_file` instances.
+
+Uniqueness of the value shall be either globally unique, or unique within the
+scope of each device, in which case `drm-pdev` shall be present as well.
+
+Userspace should make sure to not double account any usage statistics by using
+the above described criteria in order to associate data to individual clients.
+
+- drm-engine-<str>: <uint> ns
+
+GPUs usually contain multiple execution engines. Each shall be given a stable
+and unique name (str), with possible values documented in the driver specific
+documentation.
+
+Value shall be in specified time units which the respective GPU engine spent
+busy executing workloads belonging to this client.
+
+Values are not required to be constantly monotonic if it makes the driver
+implementation easier, but are required to catch up with the previously reported
+larger value within a reasonable period. Upon observing a value lower than what
+was previously read, userspace is expected to stay with that larger previous
+value until a monotonic update is seen.
+
+- drm-memory-<str>: <uint> [KiB|MiB]
+
+Each possible memory type which can be used to store buffer objects by the
+GPU in question shall be given a stable and unique name to be returned as the
+string here.
+
+Value shall reflect the amount of storage currently consumed by the buffer
+object belong to this client, in the respective memory region.
+
+Default unit shall be bytes with optional unit specifiers of 'KiB' or 'MiB'
+indicating kibi- or mebi-bytes.
diff --git a/Documentation/gpu/index.rst b/Documentation/gpu/index.rst
index b9c1214d8f23..b99dede9a5b1 100644
--- a/Documentation/gpu/index.rst
+++ b/Documentation/gpu/index.rst
@@ -10,6 +10,7 @@  Linux GPU Driver Developer's Guide
    drm-kms
    drm-kms-helpers
    drm-uapi
+   drm-usage-stats
    driver-uapi
    drm-client
    drivers