[v2,0/2] drm: fdinfo memory stats

Message ID 20230410210608.1873968-1-robdclark@gmail.com (mailing list archive)

Message

Rob Clark April 10, 2023, 9:06 p.m. UTC
From: Rob Clark <robdclark@chromium.org>

Similar motivation to another recent attempt[1], but with an attempt to
have some shared code for this, as well as documentation.

It is probably a bit UMA-centric; I guess devices with VRAM might want
some placement stats as well.  But this seems like a reasonable start.

Basic gputop support: https://patchwork.freedesktop.org/series/116236/
And already nvtop support: https://github.com/Syllo/nvtop/pull/204

[1] https://patchwork.freedesktop.org/series/112397/

Rob Clark (2):
  drm: Add fdinfo memory stats
  drm/msm: Add memory stats to fdinfo

 Documentation/gpu/drm-usage-stats.rst | 21 +++++++
 drivers/gpu/drm/drm_file.c            | 79 +++++++++++++++++++++++++++
 drivers/gpu/drm/msm/msm_drv.c         | 25 ++++++++-
 drivers/gpu/drm/msm/msm_gpu.c         |  2 -
 include/drm/drm_file.h                | 10 ++++
 5 files changed, 134 insertions(+), 3 deletions(-)
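
For readers who want to try the interface from userspace, here is a minimal
sketch of parsing the DRM keys out of one /proc/<pid>/fdinfo/<fd> file. It
assumes only the general "key: value" layout with a "drm-" key prefix and an
optional KiB/MiB unit suffix on memory values; the function names are made up
for this example, and the exact set of memory keys is whatever the series (or
a later revision) ends up defining.

```python
import re

# Multipliers for the optional unit suffix on memory values (assumed format).
_UNITS = {"KiB": 1024, "MiB": 1024 * 1024}


def parse_drm_fdinfo(text):
    """Return the drm-* key/value pairs from one fdinfo file as a dict.

    Values with a KiB/MiB suffix are converted to bytes, plain integers
    are returned as int, and anything else is kept as a string.
    """
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep or not key.startswith("drm-"):
            continue  # skip generic fdinfo lines such as pos/flags/mnt_id
        value = value.strip()
        m = re.fullmatch(r"(\d+)(?:\s+(KiB|MiB))?", value)
        if m:
            stats[key] = int(m.group(1)) * _UNITS.get(m.group(2), 1)
        else:
            stats[key] = value
    return stats


def read_drm_fdinfo(pid, fd):
    """Read and parse /proc/<pid>/fdinfo/<fd> (hypothetical helper)."""
    with open(f"/proc/{pid}/fdinfo/{fd}") as f:
        return parse_drm_fdinfo(f.read())
```

A monitoring tool would call read_drm_fdinfo() for every DRM fd of every
process it can inspect and aggregate the results per client.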

Comments

Rob Clark April 11, 2023, 4:47 p.m. UTC | #1
On Mon, Apr 10, 2023 at 2:06 PM Rob Clark <robdclark@gmail.com> wrote:
>
> From: Rob Clark <robdclark@chromium.org>
>
> Similar motivation to other similar recent attempt[1].  But with an
> attempt to have some shared code for this.  As well as documentation.
>
> It is probably a bit UMA-centric, I guess devices with VRAM might want
> some placement stats as well.  But this seems like a reasonable start.
>
> Basic gputop support: https://patchwork.freedesktop.org/series/116236/
> And already nvtop support: https://github.com/Syllo/nvtop/pull/204

On a related topic, I'm wondering if it would make sense to report
some more global things (temp, freq, etc.) via fdinfo?  Tools like
nvtop could get some of this by trawling sysfs or through other
driver-specific means.  But maybe it makes sense to have these sorts of
things reported in a standardized way (even though they aren't really
per-drm_file).

BR,
-R


Daniel Vetter April 11, 2023, 4:53 p.m. UTC | #2
On Tue, Apr 11, 2023 at 09:47:32AM -0700, Rob Clark wrote:
> On Mon, Apr 10, 2023 at 2:06 PM Rob Clark <robdclark@gmail.com> wrote:
> >
> > From: Rob Clark <robdclark@chromium.org>
> >
> > Similar motivation to other similar recent attempt[1].  But with an
> > attempt to have some shared code for this.  As well as documentation.
> >
> > It is probably a bit UMA-centric, I guess devices with VRAM might want
> > some placement stats as well.  But this seems like a reasonable start.
> >
> > Basic gputop support: https://patchwork.freedesktop.org/series/116236/
> > And already nvtop support: https://github.com/Syllo/nvtop/pull/204
> 
> On a related topic, I'm wondering if it would make sense to report
> some more global things (temp, freq, etc) via fdinfo?  Some of this,
> tools like nvtop could get by trawling sysfs or other driver specific
> ways.  But maybe it makes sense to have these sort of things reported
> in a standardized way (even though they aren't really per-drm_file)

I think that's a bit too much of a layering violation; we'd essentially
have to reinvent the hwmon sysfs uapi in fdinfo. Not really a business I
want to be in :-)

What might be needed is better glue to go from the fd or fdinfo to the
right hw device and then crawl around the hwmon in sysfs automatically. I
would not be surprised at all if we really suck at this, probably more
so on SoCs than on PCI GPUs, where at least everything should be under
the main PCI sysfs device.
-Daniel
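
The "glue" described above can be sketched from userspace roughly as
follows. This is only an illustration of the idea, assuming the usual
/sys/dev/char/<major>:<minor> lookup and an hwmon directory sitting directly
below the parent device (the PCI-style layout); the function names are
invented for this example, and on SoC platforms the lookup is expected to
come back empty for exactly the reasons discussed in this thread.

```python
import os


def hwmon_dirs_for_chardev(major, minor, sysfs_root="/sys"):
    """List hwmon directories under the device behind a char device node."""
    # /sys/dev/char/<major>:<minor> points at the class device
    # (e.g. .../drm/card0); its "device" link points to the parent device.
    dev = os.path.realpath(
        os.path.join(sysfs_root, "dev", "char", f"{major}:{minor}", "device"))
    hwmon_root = os.path.join(dev, "hwmon")
    if not os.path.isdir(hwmon_root):
        return []  # no hwmon directly below this device
    return sorted(os.path.join(hwmon_root, d)
                  for d in os.listdir(hwmon_root) if d.startswith("hwmon"))


def hwmon_dirs_for_fd(fd, sysfs_root="/sys"):
    """Same lookup, starting from an open DRM file descriptor."""
    rdev = os.fstat(fd).st_rdev
    return hwmon_dirs_for_chardev(os.major(rdev), os.minor(rdev), sysfs_root)
```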

Rob Clark April 11, 2023, 5:13 p.m. UTC | #3
On Tue, Apr 11, 2023 at 9:53 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Tue, Apr 11, 2023 at 09:47:32AM -0700, Rob Clark wrote:
> > On Mon, Apr 10, 2023 at 2:06 PM Rob Clark <robdclark@gmail.com> wrote:
> > >
> > > From: Rob Clark <robdclark@chromium.org>
> > >
> > > Similar motivation to other similar recent attempt[1].  But with an
> > > attempt to have some shared code for this.  As well as documentation.
> > >
> > > It is probably a bit UMA-centric, I guess devices with VRAM might want
> > > some placement stats as well.  But this seems like a reasonable start.
> > >
> > > Basic gputop support: https://patchwork.freedesktop.org/series/116236/
> > > And already nvtop support: https://github.com/Syllo/nvtop/pull/204
> >
> > On a related topic, I'm wondering if it would make sense to report
> > some more global things (temp, freq, etc) via fdinfo?  Some of this,
> > tools like nvtop could get by trawling sysfs or other driver specific
> > ways.  But maybe it makes sense to have these sort of things reported
> > in a standardized way (even though they aren't really per-drm_file)
>
> I think that's a bit much layering violation, we'd essentially have to
> reinvent the hwmon sysfs uapi in fdinfo. Not really a business I want to
> be in :-)

I guess this is true for temp (where there are thermal zones with
potentially multiple temp sensors; I'm still digging my way through
the thermal_cooling_device stuff).

But what about freq?  I think, especially for cases where some "fw
thing" is controlling the freq, we end up needing to use GPU counters to
measure the freq.

> What might be needed is better glue to go from the fd or fdinfo to the
> right hw device and then crawl around the hwmon in sysfs automatically. I
> would not be surprised at all if we really suck on this, probably more
> likely on SoC than pci gpus where at least everything should be under the
> main pci sysfs device.

Yeah, I *think* userspace would have to look at /proc/device-tree to
find the cooling device(s) associated with the GPU.  At least I don't
see a straightforward way to figure it out just from sysfs.

BR,
-R

Dmitry Baryshkov April 11, 2023, 5:35 p.m. UTC | #4
On Tue, 11 Apr 2023 at 20:13, Rob Clark <robdclark@gmail.com> wrote:
>
> On Tue, Apr 11, 2023 at 9:53 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> >
> > On Tue, Apr 11, 2023 at 09:47:32AM -0700, Rob Clark wrote:
> > > On Mon, Apr 10, 2023 at 2:06 PM Rob Clark <robdclark@gmail.com> wrote:
> > > >
> > > > From: Rob Clark <robdclark@chromium.org>
> > > >
> > > > Similar motivation to other similar recent attempt[1].  But with an
> > > > attempt to have some shared code for this.  As well as documentation.
> > > >
> > > > It is probably a bit UMA-centric, I guess devices with VRAM might want
> > > > some placement stats as well.  But this seems like a reasonable start.
> > > >
> > > > Basic gputop support: https://patchwork.freedesktop.org/series/116236/
> > > > And already nvtop support: https://github.com/Syllo/nvtop/pull/204
> > >
> > > On a related topic, I'm wondering if it would make sense to report
> > > some more global things (temp, freq, etc) via fdinfo?  Some of this,
> > > tools like nvtop could get by trawling sysfs or other driver specific
> > > ways.  But maybe it makes sense to have these sort of things reported
> > > in a standardized way (even though they aren't really per-drm_file)
> >
> > I think that's a bit much layering violation, we'd essentially have to
> > reinvent the hwmon sysfs uapi in fdinfo. Not really a business I want to
> > be in :-)
>
> I guess this is true for temp (where there are thermal zones with
> potentially multiple temp sensors.. but I'm still digging my way thru
> the thermal_cooling_device stuff)

It is slightly ugly. All thermal zones and cooling devices are virtual
devices (so there is not even a connection to the particular tsens
device). One can enumerate thermal zones either by checking
/sys/class/thermal/thermal_zoneN/type or through /sys/class/hwmon. For
cooling devices, again, the only enumeration is through
/sys/class/thermal/cooling_deviceN/type.
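
The enumeration by type could look roughly like this from userspace. It is
a sketch only: the function name and the sysfs_root parameter (there to make
it testable against a fake tree) are assumptions of this example, and the
"temp" attribute is taken to be in millidegrees Celsius as in the thermal
sysfs documentation.

```python
import glob
import os


def list_thermal_zones(sysfs_root="/sys"):
    """Map zone name -> (type, temperature in millidegrees C or None)."""
    zones = {}
    pattern = os.path.join(sysfs_root, "class", "thermal", "thermal_zone*")
    for zone in sorted(glob.glob(pattern)):
        with open(os.path.join(zone, "type")) as f:
            ztype = f.read().strip()
        try:
            with open(os.path.join(zone, "temp")) as f:
                temp = int(f.read().strip())
        except (OSError, ValueError):
            temp = None  # zone may be disabled or the sensor read failed
        zones[os.path.basename(zone)] = (ztype, temp)
    return zones
```

A tool would then have to guess which zone belongs to the GPU from the type
string, which is the fragile part being discussed here.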

Probably it should be possible to push cooling devices and thermal
zones under corresponding providers. However I do not know if there is
a good way to correlate cooling device (ideally a part of GPU) to the
thermal_zone (which in our case is provided by tsens / temp_alarm
rather than GPU itself).

>
> But what about freq?  I think, esp for cases where some "fw thing" is
> controlling the freq we end up needing to use gpu counters to measure
> the freq.

For the freq it is slightly easier: devices in /sys/class/devfreq/* are
registered under the proper parent (IOW, the GPU). So one can read
/sys/class/devfreq/3d00000.gpu/cur_freq or
/sys/bus/platform/devices/3d00000.gpu/devfreq/3d00000.gpu/cur_freq.
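
Reading that attribute is a one-liner, but for completeness here is a small
sketch. The helper name and the sysfs_root parameter are invented for this
example; the device name (such as "3d00000.gpu" above) is platform-specific
and has to be discovered by the caller.

```python
import os


def read_devfreq_cur_freq(name, sysfs_root="/sys"):
    """Return cur_freq for /sys/class/devfreq/<name> as an int, or None."""
    path = os.path.join(sysfs_root, "class", "devfreq", name, "cur_freq")
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None  # no such devfreq device, or unreadable value
```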

However, because of the way components are used, there is no link from
/sys/class/drm/card0
(/sys/devices/platform/soc@0/ae00000.display-subsystem/ae01000.display-controller/drm/card0)
to /sys/devices/platform/soc@0/3d00000.gpu, the GPU unit.

Getting all these items together in a platform-independent way would
definitely be an important but complex topic.

Daniel Vetter April 11, 2023, 6:26 p.m. UTC | #5
On Tue, Apr 11, 2023 at 08:35:48PM +0300, Dmitry Baryshkov wrote:
> On Tue, 11 Apr 2023 at 20:13, Rob Clark <robdclark@gmail.com> wrote:
> >
> > On Tue, Apr 11, 2023 at 9:53 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > >
> > > On Tue, Apr 11, 2023 at 09:47:32AM -0700, Rob Clark wrote:
> > > > On Mon, Apr 10, 2023 at 2:06 PM Rob Clark <robdclark@gmail.com> wrote:
> > > > >
> > > > > From: Rob Clark <robdclark@chromium.org>
> > > > >
> > > > > Similar motivation to other similar recent attempt[1].  But with an
> > > > > attempt to have some shared code for this.  As well as documentation.
> > > > >
> > > > > It is probably a bit UMA-centric, I guess devices with VRAM might want
> > > > > some placement stats as well.  But this seems like a reasonable start.
> > > > >
> > > > > Basic gputop support: https://patchwork.freedesktop.org/series/116236/
> > > > > And already nvtop support: https://github.com/Syllo/nvtop/pull/204
> > > >
> > > > On a related topic, I'm wondering if it would make sense to report
> > > > some more global things (temp, freq, etc) via fdinfo?  Some of this,
> > > > tools like nvtop could get by trawling sysfs or other driver specific
> > > > ways.  But maybe it makes sense to have these sort of things reported
> > > > in a standardized way (even though they aren't really per-drm_file)
> > >
> > > I think that's a bit much layering violation, we'd essentially have to
> > > reinvent the hwmon sysfs uapi in fdinfo. Not really a business I want to
> > > be in :-)
> >
> > I guess this is true for temp (where there are thermal zones with
> > potentially multiple temp sensors.. but I'm still digging my way thru
> > the thermal_cooling_device stuff)
> 
> It is slightly ugly. All thermal zones and cooling devices are virtual
> devices (so, even no connection to the particular tsens device). One
> can either enumerate them by checking
> /sys/class/thermal/thermal_zoneN/type or enumerate them through
> /sys/class/hwmon. For cooling devices again the only enumeration is
> through /sys/class/thermal/cooling_deviceN/type.
> 
> Probably it should be possible to push cooling devices and thermal
> zones under corresponding providers. However I do not know if there is
> a good way to correlate cooling device (ideally a part of GPU) to the
> thermal_zone (which in our case is provided by tsens / temp_alarm
> rather than GPU itself).

Are there not even sysfs links connecting the pieces in both directions?

> > But what about freq?  I think, esp for cases where some "fw thing" is
> > controlling the freq we end up needing to use gpu counters to measure
> > the freq.
> 
> For the freq it is slightly easier: /sys/class/devfreq/*, devices are
> registered under proper parent (IOW, GPU). So one can read
> /sys/class/devfreq/3d00000.gpu/cur_freq or
> /sys/bus/platform/devices/3d00000.gpu/devfreq/3d00000.gpu/cur_freq.
> 
> However because of the components usage, there is no link from
> /sys/class/drm/card0
> (/sys/devices/platform/soc@0/ae00000.display-subsystem/ae01000.display-controller/drm/card0)
> to /sys/devices/platform/soc@0/3d00000.gpu, the GPU unit.

Hm ... do we need to make component more visible in sysfs, with _looooots_
of links? Atm it's just not even there.

> Getting all these items together in a platform-independent way would
> be definitely an important but complex topic.

Yeah this sounds like some work. But also sounds like it's all generic
issues (thermal zones above and component here) that really should be
fixed at that level?

Cheers, Daniel


Rob Clark April 11, 2023, 6:28 p.m. UTC | #6
On Tue, Apr 11, 2023 at 10:36 AM Dmitry Baryshkov
<dmitry.baryshkov@linaro.org> wrote:
>
> On Tue, 11 Apr 2023 at 20:13, Rob Clark <robdclark@gmail.com> wrote:
> >
> > On Tue, Apr 11, 2023 at 9:53 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > >
> > > On Tue, Apr 11, 2023 at 09:47:32AM -0700, Rob Clark wrote:
> > > > On Mon, Apr 10, 2023 at 2:06 PM Rob Clark <robdclark@gmail.com> wrote:
> > > > >
> > > > > From: Rob Clark <robdclark@chromium.org>
> > > > >
> > > > > Similar motivation to other similar recent attempt[1].  But with an
> > > > > attempt to have some shared code for this.  As well as documentation.
> > > > >
> > > > > It is probably a bit UMA-centric, I guess devices with VRAM might want
> > > > > some placement stats as well.  But this seems like a reasonable start.
> > > > >
> > > > > Basic gputop support: https://patchwork.freedesktop.org/series/116236/
> > > > > And already nvtop support: https://github.com/Syllo/nvtop/pull/204
> > > >
> > > > On a related topic, I'm wondering if it would make sense to report
> > > > some more global things (temp, freq, etc) via fdinfo?  Some of this,
> > > > tools like nvtop could get by trawling sysfs or other driver specific
> > > > ways.  But maybe it makes sense to have these sort of things reported
> > > > in a standardized way (even though they aren't really per-drm_file)
> > >
> > > I think that's a bit much layering violation, we'd essentially have to
> > > reinvent the hwmon sysfs uapi in fdinfo. Not really a business I want to
> > > be in :-)
> >
> > I guess this is true for temp (where there are thermal zones with
> > potentially multiple temp sensors.. but I'm still digging my way thru
> > the thermal_cooling_device stuff)
>
> It is slightly ugly. All thermal zones and cooling devices are virtual
> devices (so, even no connection to the particular tsens device). One
> can either enumerate them by checking
> /sys/class/thermal/thermal_zoneN/type or enumerate them through
> /sys/class/hwmon. For cooling devices again the only enumeration is
> through /sys/class/thermal/cooling_deviceN/type.
>
> Probably it should be possible to push cooling devices and thermal
> zones under corresponding providers. However I do not know if there is
> a good way to correlate cooling device (ideally a part of GPU) to the
> thermal_zone (which in our case is provided by tsens / temp_alarm
> rather than GPU itself).
>
> >
> > But what about freq?  I think, esp for cases where some "fw thing" is
> > controlling the freq we end up needing to use gpu counters to measure
> > the freq.
>
> For the freq it is slightly easier: /sys/class/devfreq/*, devices are
> registered under proper parent (IOW, GPU). So one can read
> /sys/class/devfreq/3d00000.gpu/cur_freq or
> /sys/bus/platform/devices/3d00000.gpu/devfreq/3d00000.gpu/cur_freq.
>
> However because of the components usage, there is no link from
> /sys/class/drm/card0
> (/sys/devices/platform/soc@0/ae00000.display-subsystem/ae01000.display-controller/drm/card0)
> to /sys/devices/platform/soc@0/3d00000.gpu, the GPU unit.
>
> Getting all these items together in a platform-independent way would
> be definitely an important but complex topic.

But I don't believe any of the PCI GPUs use devfreq ;-)

And also, you can't expect the CPU to actually know the freq when fw
is the one controlling it.  We can currently get a reasonable
approximation from devfreq, but that stops if IFPC is implemented.  And
other GPUs have even less direct control.  So freq is a thing that I
don't think we should try to get from "common frameworks".

BR,
-R

Dmitry Baryshkov April 11, 2023, 10:27 p.m. UTC | #7
On 11/04/2023 21:26, Daniel Vetter wrote:
> On Tue, Apr 11, 2023 at 08:35:48PM +0300, Dmitry Baryshkov wrote:
>> On Tue, 11 Apr 2023 at 20:13, Rob Clark <robdclark@gmail.com> wrote:
>>>
>>> On Tue, Apr 11, 2023 at 9:53 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>>>>
>>>> On Tue, Apr 11, 2023 at 09:47:32AM -0700, Rob Clark wrote:
>>>>> On Mon, Apr 10, 2023 at 2:06 PM Rob Clark <robdclark@gmail.com> wrote:
>>>>>>
>>>>>> From: Rob Clark <robdclark@chromium.org>
>>>>>>
>>>>>> Similar motivation to other similar recent attempt[1].  But with an
>>>>>> attempt to have some shared code for this.  As well as documentation.
>>>>>>
>>>>>> It is probably a bit UMA-centric, I guess devices with VRAM might want
>>>>>> some placement stats as well.  But this seems like a reasonable start.
>>>>>>
>>>>>> Basic gputop support: https://patchwork.freedesktop.org/series/116236/
>>>>>> And already nvtop support: https://github.com/Syllo/nvtop/pull/204
>>>>>
>>>>> On a related topic, I'm wondering if it would make sense to report
>>>>> some more global things (temp, freq, etc) via fdinfo?  Some of this,
>>>>> tools like nvtop could get by trawling sysfs or other driver specific
>>>>> ways.  But maybe it makes sense to have these sort of things reported
>>>>> in a standardized way (even though they aren't really per-drm_file)
>>>>
>>>> I think that's a bit much layering violation, we'd essentially have to
>>>> reinvent the hwmon sysfs uapi in fdinfo. Not really a business I want to
>>>> be in :-)
>>>
>>> I guess this is true for temp (where there are thermal zones with
>>> potentially multiple temp sensors.. but I'm still digging my way thru
>>> the thermal_cooling_device stuff)
>>
>> It is slightly ugly. All thermal zones and cooling devices are virtual
>> devices (so, even no connection to the particular tsens device). One
>> can either enumerate them by checking
>> /sys/class/thermal/thermal_zoneN/type or enumerate them through
>> /sys/class/hwmon. For cooling devices again the only enumeration is
>> through /sys/class/thermal/cooling_deviceN/type.
>>
>> Probably it should be possible to push cooling devices and thermal
>> zones under corresponding providers. However I do not know if there is
>> a good way to correlate cooling device (ideally a part of GPU) to the
>> thermal_zone (which in our case is provided by tsens / temp_alarm
>> rather than GPU itself).
> 
> There's not even sysfs links to connect the pieces in both ways?

I missed them in the most obvious place:

/sys/class/thermal/thermal_zone1/cdev0 -> ../cooling_device0

So, there is a link from thermal zone to cooling device.
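
Following those links from userspace could be sketched like this. The
function name is made up for the example, and it assumes only the cdevN
symlinks shown above plus a "type" attribute in each cooling device
directory; sibling attributes such as cdevN_trip_point are skipped.

```python
import os
import re


def cooling_devices_for_zone(zone_dir):
    """Return {cdevN: (resolved cooling device path, type)} for one zone."""
    result = {}
    for entry in sorted(os.listdir(zone_dir)):
        if not re.fullmatch(r"cdev\d+", entry):
            continue  # skip cdevN_trip_point, cdevN_weight, etc.
        target = os.path.realpath(os.path.join(zone_dir, entry))
        with open(os.path.join(target, "type")) as f:
            result[entry] = (target, f.read().strip())
    return result
```

This only goes from a thermal zone to its cooling devices; mapping a cooling
device back to the GPU device still relies on its type string.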

> 
>>> But what about freq?  I think, esp for cases where some "fw thing" is
>>> controlling the freq we end up needing to use gpu counters to measure
>>> the freq.
>>
>> For the freq it is slightly easier: /sys/class/devfreq/*, devices are
>> registered under proper parent (IOW, GPU). So one can read
>> /sys/class/devfreq/3d00000.gpu/cur_freq or
>> /sys/bus/platform/devices/3d00000.gpu/devfreq/3d00000.gpu/cur_freq.
>>
>> However because of the components usage, there is no link from
>> /sys/class/drm/card0
>> (/sys/devices/platform/soc@0/ae00000.display-subsystem/ae01000.display-controller/drm/card0)
>> to /sys/devices/platform/soc@0/3d00000.gpu, the GPU unit.
> 
> Hm ... do we need to make component more visible in sysfs, with _looooots_
> of links? Atm it's just not even there.

Maybe. Or maybe we should use DPU (the component master and a parent of
drm/card0) as the devfreq parent too.

Dmitry Baryshkov April 11, 2023, 10:36 p.m. UTC | #8
On 11/04/2023 21:28, Rob Clark wrote:
> On Tue, Apr 11, 2023 at 10:36 AM Dmitry Baryshkov
> <dmitry.baryshkov@linaro.org> wrote:
>>
>> On Tue, 11 Apr 2023 at 20:13, Rob Clark <robdclark@gmail.com> wrote:
>>>
>>> On Tue, Apr 11, 2023 at 9:53 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>>>>
>>>> On Tue, Apr 11, 2023 at 09:47:32AM -0700, Rob Clark wrote:
>>>>> On Mon, Apr 10, 2023 at 2:06 PM Rob Clark <robdclark@gmail.com> wrote:
>>>>>>
>>>>>> From: Rob Clark <robdclark@chromium.org>
>>>>>>
>>>>>> Similar motivation to other similar recent attempt[1].  But with an
>>>>>> attempt to have some shared code for this.  As well as documentation.
>>>>>>
>>>>>> It is probably a bit UMA-centric, I guess devices with VRAM might want
>>>>>> some placement stats as well.  But this seems like a reasonable start.
>>>>>>
>>>>>> Basic gputop support: https://patchwork.freedesktop.org/series/116236/
>>>>>> And already nvtop support: https://github.com/Syllo/nvtop/pull/204
>>>>>
>>>>> On a related topic, I'm wondering if it would make sense to report
>>>>> some more global things (temp, freq, etc) via fdinfo?  Some of this,
>>>>> tools like nvtop could get by trawling sysfs or other driver specific
>>>>> ways.  But maybe it makes sense to have these sort of things reported
>>>>> in a standardized way (even though they aren't really per-drm_file)
>>>>
>>>> I think that's a bit much layering violation, we'd essentially have to
>>>> reinvent the hwmon sysfs uapi in fdinfo. Not really a business I want to
>>>> be in :-)
>>>
>>> I guess this is true for temp (where there are thermal zones with
>>> potentially multiple temp sensors.. but I'm still digging my way thru
>>> the thermal_cooling_device stuff)
>>
>> It is slightly ugly. All thermal zones and cooling devices are virtual
>> devices (so, even no connection to the particular tsens device). One
>> can either enumerate them by checking
>> /sys/class/thermal/thermal_zoneN/type or enumerate them through
>> /sys/class/hwmon. For cooling devices again the only enumeration is
>> through /sys/class/thermal/cooling_deviceN/type.
>>
>> Probably it should be possible to push cooling devices and thermal
>> zones under corresponding providers. However I do not know if there is
>> a good way to correlate cooling device (ideally a part of GPU) to the
>> thermal_zone (which in our case is provided by tsens / temp_alarm
>> rather than GPU itself).
>>
>>>
>>> But what about freq?  I think, esp for cases where some "fw thing" is
>>> controlling the freq we end up needing to use gpu counters to measure
>>> the freq.
>>
>> For the freq it is slightly easier: /sys/class/devfreq/*, devices are
>> registered under proper parent (IOW, GPU). So one can read
>> /sys/class/devfreq/3d00000.gpu/cur_freq or
>> /sys/bus/platform/devices/3d00000.gpu/devfreq/3d00000.gpu/cur_freq.
>>
>> However because of the components usage, there is no link from
>> /sys/class/drm/card0
>> (/sys/devices/platform/soc@0/ae00000.display-subsystem/ae01000.display-controller/drm/card0)
>> to /sys/devices/platform/soc@0/3d00000.gpu, the GPU unit.
>>
>> Getting all these items together in a platform-independent way would
>> be definitely an important but complex topic.
> 
> But I don't believe any of the pci gpu's use devfreq ;-)
> 
> And also, you can't expect the CPU to actually know the freq when fw
> is the one controlling freq.  We can, currently, have a reasonable
> approximation from devfreq but that stops if IFPC is implemented.  And
> other GPUs have even less direct control.  So freq is a thing that I
> don't think we should try to get from "common frameworks"

I think it might be useful to add another passive devfreq governor type 
for external frequencies. This way we can use the same interface to 
export non-CPU-controlled frequencies.

Daniel Vetter April 12, 2023, 8:11 a.m. UTC | #9
On Wed, Apr 12, 2023 at 01:36:52AM +0300, Dmitry Baryshkov wrote:
> On 11/04/2023 21:28, Rob Clark wrote:
> > On Tue, Apr 11, 2023 at 10:36 AM Dmitry Baryshkov
> > <dmitry.baryshkov@linaro.org> wrote:
> > > 
> > > On Tue, 11 Apr 2023 at 20:13, Rob Clark <robdclark@gmail.com> wrote:
> > > > 
> > > > On Tue, Apr 11, 2023 at 9:53 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > > > 
> > > > > On Tue, Apr 11, 2023 at 09:47:32AM -0700, Rob Clark wrote:
> > > > > > On Mon, Apr 10, 2023 at 2:06 PM Rob Clark <robdclark@gmail.com> wrote:
> > > > > > > 
> > > > > > > From: Rob Clark <robdclark@chromium.org>
> > > > > > > 
> > > > > > > Similar motivation to other similar recent attempt[1].  But with an
> > > > > > > attempt to have some shared code for this.  As well as documentation.
> > > > > > > 
> > > > > > > It is probably a bit UMA-centric, I guess devices with VRAM might want
> > > > > > > some placement stats as well.  But this seems like a reasonable start.
> > > > > > > 
> > > > > > > Basic gputop support: https://patchwork.freedesktop.org/series/116236/
> > > > > > > And already nvtop support: https://github.com/Syllo/nvtop/pull/204
> > > > > > 
> > > > > > On a related topic, I'm wondering if it would make sense to report
> > > > > > some more global things (temp, freq, etc) via fdinfo?  Some of this,
> > > > > > tools like nvtop could get by trawling sysfs or other driver specific
> > > > > > ways.  But maybe it makes sense to have these sort of things reported
> > > > > > in a standardized way (even though they aren't really per-drm_file)
> > > > > 
> > > > > I think that's a bit much layering violation, we'd essentially have to
> > > > > reinvent the hwmon sysfs uapi in fdinfo. Not really a business I want to
> > > > > be in :-)
> > > > 
> > > > I guess this is true for temp (where there are thermal zones with
> > > > potentially multiple temp sensors.. but I'm still digging my way thru
> > > > the thermal_cooling_device stuff)
> > > 
> > > It is slightly ugly. All thermal zones and cooling devices are virtual
> > > devices (so, even no connection to the particular tsens device). One
> > > can either enumerate them by checking
> > > /sys/class/thermal/thermal_zoneN/type or enumerate them through
> > > /sys/class/hwmon. For cooling devices again the only enumeration is
> > > through /sys/class/thermal/cooling_deviceN/type.
> > > 
> > > Probably it should be possible to push cooling devices and thermal
> > > zones under corresponding providers. However I do not know if there is
> > > a good way to correlate cooling device (ideally a part of GPU) to the
> > > thermal_zone (which in our case is provided by tsens / temp_alarm
> > > rather than GPU itself).
> > > 
> > > > 
> > > > But what about freq?  I think, esp for cases where some "fw thing" is
> > > > controlling the freq we end up needing to use gpu counters to measure
> > > > the freq.
> > > 
> > > For the freq it is slightly easier: /sys/class/devfreq/*, devices are
> > > registered under proper parent (IOW, GPU). So one can read
> > > /sys/class/devfreq/3d00000.gpu/cur_freq or
> > > /sys/bus/platform/devices/3d00000.gpu/devfreq/3d00000.gpu/cur_freq.
> > > 
> > > However because of the components usage, there is no link from
> > > /sys/class/drm/card0
> > > (/sys/devices/platform/soc@0/ae00000.display-subsystem/ae01000.display-controller/drm/card0)
> > > to /sys/devices/platform/soc@0/3d00000.gpu, the GPU unit.
> > > 
> > > Getting all these items together in a platform-independent way would
> > > be definitely an important but complex topic.
> > 
> > But I don't believe any of the pci gpu's use devfreq ;-)
> > 
> > And also, you can't expect the CPU to actually know the freq when fw
> > is the one controlling freq.  We can, currently, have a reasonable
> > approximation from devfreq but that stops if IFPC is implemented.  And
> > other GPUs have even less direct control.  So freq is a thing that I
> > don't think we should try to get from "common frameworks"
> 
> I think it might be useful to add another passive devfreq governor type for
> external frequencies. This way we can use the same interface to export
> non-CPU-controlled frequencies.

Yeah this sounds like a decent idea to me too. It might also solve the fun
of various pci devices having very non-standard freq controls in sysfs
(looking at least at i915 here ...)

I guess it would minimally be a good idea if we could document this, or
maybe have a reference implementation in nvtop or whatever the cool thing
is rn.
-Daniel

Rodrigo Vivi April 12, 2023, 12:47 p.m. UTC | #10
On Wed, Apr 12, 2023 at 10:11:32AM +0200, Daniel Vetter wrote:
> On Wed, Apr 12, 2023 at 01:36:52AM +0300, Dmitry Baryshkov wrote:
> > On 11/04/2023 21:28, Rob Clark wrote:
> > > On Tue, Apr 11, 2023 at 10:36 AM Dmitry Baryshkov
> > > <dmitry.baryshkov@linaro.org> wrote:
> > > > 
> > > > On Tue, 11 Apr 2023 at 20:13, Rob Clark <robdclark@gmail.com> wrote:
> > > > > 
> > > > > On Tue, Apr 11, 2023 at 9:53 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > > > > 
> > > > > > On Tue, Apr 11, 2023 at 09:47:32AM -0700, Rob Clark wrote:
> > > > > > > On Mon, Apr 10, 2023 at 2:06 PM Rob Clark <robdclark@gmail.com> wrote:
> > > > > > > > 
> > > > > > > > From: Rob Clark <robdclark@chromium.org>
> > > > > > > > 
> > > > > > > > Similar motivation to other similar recent attempt[1].  But with an
> > > > > > > > attempt to have some shared code for this.  As well as documentation.
> > > > > > > > 
> > > > > > > > It is probably a bit UMA-centric, I guess devices with VRAM might want
> > > > > > > > some placement stats as well.  But this seems like a reasonable start.
> > > > > > > > 
> > > > > > > > Basic gputop support: https://patchwork.freedesktop.org/series/116236/
> > > > > > > > And already nvtop support: https://github.com/Syllo/nvtop/pull/204
> > > > > > > 
> > > > > > > On a related topic, I'm wondering if it would make sense to report
> > > > > > > some more global things (temp, freq, etc) via fdinfo?  Some of this,
> > > > > > > tools like nvtop could get by trawling sysfs or other driver specific
> > > > > > > ways.  But maybe it makes sense to have these sort of things reported
> > > > > > > in a standardized way (even though they aren't really per-drm_file)
> > > > > > 
> > > > > > I think that's a bit much layering violation, we'd essentially have to
> > > > > > reinvent the hwmon sysfs uapi in fdinfo. Not really a business I want to
> > > > > > be in :-)
> > > > > 
> > > > > I guess this is true for temp (where there are thermal zones with
> > > > > potentially multiple temp sensors.. but I'm still digging my way thru
> > > > > the thermal_cooling_device stuff)
> > > > 
> > > > It is slightly ugly. All thermal zones and cooling devices are virtual
> > > > devices (so, even no connection to the particular tsens device). One
> > > > can either enumerate them by checking
> > > > /sys/class/thermal/thermal_zoneN/type or enumerate them through
> > > > /sys/class/hwmon. For cooling devices again the only enumeration is
> > > > through /sys/class/thermal/cooling_deviceN/type.
> > > > 
> > > > Probably it should be possible to push cooling devices and thermal
> > > > zones under corresponding providers. However I do not know if there is
> > > > a good way to correlate cooling device (ideally a part of GPU) to the
> > > > thermal_zone (which in our case is provided by tsens / temp_alarm
> > > > rather than GPU itself).
> > > > 
> > > > > 
> > > > > But what about freq?  I think, esp for cases where some "fw thing" is
> > > > > controlling the freq we end up needing to use gpu counters to measure
> > > > > the freq.
> > > > 
> > > > For the freq it is slightly easier: /sys/class/devfreq/*, devices are
> > > > registered under proper parent (IOW, GPU). So one can read
> > > > /sys/class/devfreq/3d00000.gpu/cur_freq or
> > > > /sys/bus/platform/devices/3d00000.gpu/devfreq/3d00000.gpu/cur_freq.
> > > > 
> > > > However because of the components usage, there is no link from
> > > > /sys/class/drm/card0
> > > > (/sys/devices/platform/soc@0/ae00000.display-subsystem/ae01000.display-controller/drm/card0)
> > > > to /sys/devices/platform/soc@0/3d00000.gpu, the GPU unit.
> > > > 
> > > > Getting all these items together in a platform-independent way would
> > > > be definitely an important but complex topic.
> > > 
> > > But I don't believe any of the pci gpu's use devfreq ;-)
> > > 
> > > And also, you can't expect the CPU to actually know the freq when fw
> > > is the one controlling freq.  We can, currently, have a reasonable
> > > approximation from devfreq but that stops if IFPC is implemented.  And
> > > other GPUs have even less direct control.  So freq is a thing that I
> > > don't think we should try to get from "common frameworks"
> > 
> > I think it might be useful to add another passive devfreq governor type for
> > external frequencies. This way we can use the same interface to export
> > non-CPU-controlled frequencies.
> 
> Yeah this sounds like a decent idea to me too. It might also solve the fun
> of various pci devices having very non-standard freq controls in sysfs
> (looking at least at i915 here ...)

I also like the idea of having some common infrastructure for the GPU freq.

hwmon has good infrastructure, but it is more focused on individual
monitoring devices and not very welcoming to embedded monitoring and control.
I still want to check whether at least some freq control could be aligned
there.

Another thing that complicates this is that Intel GPUs have multiple
frequency domains and controls with multipliers that are not very
standard or easy to integrate.

At a quick glance, devfreq seems neat because it aligns with cpufreq
and governors. But again, it would be hard to align with the multiple domains
and controls. Still, it deserves a look.

I will look at both fronts for Xe: hwmon and devfreq. Right now Xe has
far fewer controls than i915, but I can imagine there will soon be
requirements that make that grow, and I fear we would end up just
like i915. So I will take a look before that happens.

Rob Clark April 12, 2023, 8:09 p.m. UTC | #11
On Wed, Apr 12, 2023 at 5:47 AM Rodrigo Vivi <rodrigo.vivi@intel.com> wrote:
>
> On Wed, Apr 12, 2023 at 10:11:32AM +0200, Daniel Vetter wrote:
> > On Wed, Apr 12, 2023 at 01:36:52AM +0300, Dmitry Baryshkov wrote:
> > > On 11/04/2023 21:28, Rob Clark wrote:
> > > > On Tue, Apr 11, 2023 at 10:36 AM Dmitry Baryshkov
> > > > <dmitry.baryshkov@linaro.org> wrote:
> > > > >
> > > > > On Tue, 11 Apr 2023 at 20:13, Rob Clark <robdclark@gmail.com> wrote:
> > > > > >
> > > > > > On Tue, Apr 11, 2023 at 9:53 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > > > > >
> > > > > > > On Tue, Apr 11, 2023 at 09:47:32AM -0700, Rob Clark wrote:
> > > > > > > > On Mon, Apr 10, 2023 at 2:06 PM Rob Clark <robdclark@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > From: Rob Clark <robdclark@chromium.org>
> > > > > > > > >
> > > > > > > > > Similar motivation to other similar recent attempt[1].  But with an
> > > > > > > > > attempt to have some shared code for this.  As well as documentation.
> > > > > > > > >
> > > > > > > > > It is probably a bit UMA-centric, I guess devices with VRAM might want
> > > > > > > > > some placement stats as well.  But this seems like a reasonable start.
> > > > > > > > >
> > > > > > > > > Basic gputop support: https://patchwork.freedesktop.org/series/116236/
> > > > > > > > > And already nvtop support: https://github.com/Syllo/nvtop/pull/204
> > > > > > > >
> > > > > > > > On a related topic, I'm wondering if it would make sense to report
> > > > > > > > some more global things (temp, freq, etc) via fdinfo?  Some of this,
> > > > > > > > tools like nvtop could get by trawling sysfs or other driver specific
> > > > > > > > ways.  But maybe it makes sense to have these sort of things reported
> > > > > > > > in a standardized way (even though they aren't really per-drm_file)
> > > > > > >
> > > > > > > I think that's a bit much layering violation, we'd essentially have to
> > > > > > > reinvent the hwmon sysfs uapi in fdinfo. Not really a business I want to
> > > > > > > be in :-)
> > > > > >
> > > > > > I guess this is true for temp (where there are thermal zones with
> > > > > > potentially multiple temp sensors.. but I'm still digging my way thru
> > > > > > the thermal_cooling_device stuff)
> > > > >
> > > > > It is slightly ugly. All thermal zones and cooling devices are virtual
> > > > > devices (so, even no connection to the particular tsens device). One
> > > > > can either enumerate them by checking
> > > > > /sys/class/thermal/thermal_zoneN/type or enumerate them through
> > > > > /sys/class/hwmon. For cooling devices again the only enumeration is
> > > > > through /sys/class/thermal/cooling_deviceN/type.
> > > > >
> > > > > Probably it should be possible to push cooling devices and thermal
> > > > > zones under corresponding providers. However I do not know if there is
> > > > > a good way to correlate cooling device (ideally a part of GPU) to the
> > > > > thermal_zone (which in our case is provided by tsens / temp_alarm
> > > > > rather than GPU itself).
> > > > >
> > > > > >
> > > > > > But what about freq?  I think, esp for cases where some "fw thing" is
> > > > > > controlling the freq we end up needing to use gpu counters to measure
> > > > > > the freq.
> > > > >
> > > > > For the freq it is slightly easier: /sys/class/devfreq/*, devices are
> > > > > registered under proper parent (IOW, GPU). So one can read
> > > > > /sys/class/devfreq/3d00000.gpu/cur_freq or
> > > > > /sys/bus/platform/devices/3d00000.gpu/devfreq/3d00000.gpu/cur_freq.
> > > > >
> > > > > However because of the components usage, there is no link from
> > > > > /sys/class/drm/card0
> > > > > (/sys/devices/platform/soc@0/ae00000.display-subsystem/ae01000.display-controller/drm/card0)
> > > > > to /sys/devices/platform/soc@0/3d00000.gpu, the GPU unit.
> > > > >
> > > > > Getting all these items together in a platform-independent way would
> > > > > be definitely an important but complex topic.
> > > >
> > > > But I don't believe any of the pci gpu's use devfreq ;-)
> > > >
> > > > And also, you can't expect the CPU to actually know the freq when fw
> > > > is the one controlling freq.  We can, currently, have a reasonable
> > > > approximation from devfreq but that stops if IFPC is implemented.  And
> > > > other GPUs have even less direct control.  So freq is a thing that I
> > > > don't think we should try to get from "common frameworks"
> > >
> > > I think it might be useful to add another passive devfreq governor type for
> > > external frequencies. This way we can use the same interface to export
> > > non-CPU-controlled frequencies.
> >
> > Yeah this sounds like a decent idea to me too. It might also solve the fun
> > of various pci devices having very non-standard freq controls in sysfs
> > (looking at least at i915 here ...)
>
> I also like the idea of having some common infrastructure for the GPU freq.
>
> hwmon has good infrastructure, but it is more focused on individual
> monitoring devices and not very welcoming to embedded monitoring and control.
> I still want to check whether at least some freq control could be aligned
> there.
>
> Another thing that complicates this is that Intel GPUs have multiple
> frequency domains and controls with multipliers that are not very
> standard or easy to integrate.
>
> At a quick glance, devfreq seems neat because it aligns with cpufreq
> and governors. But again, it would be hard to align with the multiple domains
> and controls. Still, it deserves a look.
>
> I will look at both fronts for Xe: hwmon and devfreq. Right now Xe has
> far fewer controls than i915, but I can imagine there will soon be
> requirements that make that grow, and I fear we would end up just
> like i915. So I will take a look before that happens.

So it looks like i915 (dgpu only) and nouveau already use hwmon.. so
maybe this is a good way to expose temp.  Maybe we can wire up some
sort of helper for drivers which use thermal_cooling_device (which can
be composed of multiple sensors) to give back an aggregate temp for
hwmon to report?

Freq could possibly be added to hwmon (ie. seems like a reasonable
attribute to add).  Devfreq might also be an option but on arm it
isn't necessarily associated with the drm device, whereas we could
associate the hwmon with the drm device to make it easier for
userspace to find.
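
To illustrate why that association helps: when a driver parents its hwmon device to the same device as the DRM node, userspace can reach it by walking from the card directory. This is a minimal sketch, assuming the layout /sys/class/drm/cardN/device/hwmon/hwmonM/tempK_input and the standard hwmon convention of millidegrees Celsius. The function names, and the choice of the hottest sensor as the "aggregate", are assumptions for illustration only.

```python
import glob
import os

def card_temps(card_dir):
    """Return {sensor_path: degrees_C} for hwmon temp inputs under a DRM card."""
    pattern = os.path.join(card_dir, "device", "hwmon", "hwmon*", "temp*_input")
    temps = {}
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path) as f:
                # hwmon reports temperatures in millidegrees Celsius.
                temps[path] = int(f.read().strip()) / 1000.0
        except (OSError, ValueError):
            continue
    return temps

def aggregate_temp(card_dir):
    """Hottest sensor reading, or None if no hwmon is linked to the card."""
    temps = card_temps(card_dir)
    return max(temps.values()) if temps else None
```

On a system where the hwmon is not linked under the card's device, aggregate_temp("/sys/class/drm/card0") simply returns None, which is the gap being discussed here.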

BR,
-R

Dmitry Baryshkov April 12, 2023, 8:19 p.m. UTC | #12
On Wed, 12 Apr 2023 at 23:09, Rob Clark <robdclark@gmail.com> wrote:
>
> On Wed, Apr 12, 2023 at 5:47 AM Rodrigo Vivi <rodrigo.vivi@intel.com> wrote:
> >
> > On Wed, Apr 12, 2023 at 10:11:32AM +0200, Daniel Vetter wrote:
> > > On Wed, Apr 12, 2023 at 01:36:52AM +0300, Dmitry Baryshkov wrote:
> > > > On 11/04/2023 21:28, Rob Clark wrote:
> > > > > On Tue, Apr 11, 2023 at 10:36 AM Dmitry Baryshkov
> > > > > <dmitry.baryshkov@linaro.org> wrote:
> > > > > >
> > > > > > On Tue, 11 Apr 2023 at 20:13, Rob Clark <robdclark@gmail.com> wrote:
> > > > > > >
> > > > > > > On Tue, Apr 11, 2023 at 9:53 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > > > > > >
> > > > > > > > On Tue, Apr 11, 2023 at 09:47:32AM -0700, Rob Clark wrote:
> > > > > > > > > On Mon, Apr 10, 2023 at 2:06 PM Rob Clark <robdclark@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > From: Rob Clark <robdclark@chromium.org>
> > > > > > > > > >
> > > > > > > > > > Similar motivation to other similar recent attempt[1].  But with an
> > > > > > > > > > attempt to have some shared code for this.  As well as documentation.
> > > > > > > > > >
> > > > > > > > > > It is probably a bit UMA-centric, I guess devices with VRAM might want
> > > > > > > > > > some placement stats as well.  But this seems like a reasonable start.
> > > > > > > > > >
> > > > > > > > > > Basic gputop support: https://patchwork.freedesktop.org/series/116236/
> > > > > > > > > > And already nvtop support: https://github.com/Syllo/nvtop/pull/204
> > > > > > > > >
> > > > > > > > > On a related topic, I'm wondering if it would make sense to report
> > > > > > > > > some more global things (temp, freq, etc) via fdinfo?  Some of this,
> > > > > > > > > tools like nvtop could get by trawling sysfs or other driver specific
> > > > > > > > > ways.  But maybe it makes sense to have these sort of things reported
> > > > > > > > > in a standardized way (even though they aren't really per-drm_file)
> > > > > > > >
> > > > > > > > I think that's a bit much layering violation, we'd essentially have to
> > > > > > > > reinvent the hwmon sysfs uapi in fdinfo. Not really a business I want to
> > > > > > > > be in :-)
> > > > > > >
> > > > > > > I guess this is true for temp (where there are thermal zones with
> > > > > > > potentially multiple temp sensors.. but I'm still digging my way thru
> > > > > > > the thermal_cooling_device stuff)
> > > > > >
> > > > > > It is slightly ugly. All thermal zones and cooling devices are virtual
> > > > > > devices (so, even no connection to the particular tsens device). One
> > > > > > can either enumerate them by checking
> > > > > > /sys/class/thermal/thermal_zoneN/type or enumerate them through
> > > > > > /sys/class/hwmon. For cooling devices again the only enumeration is
> > > > > > through /sys/class/thermal/cooling_deviceN/type.
> > > > > >
> > > > > > Probably it should be possible to push cooling devices and thermal
> > > > > > zones under corresponding providers. However I do not know if there is
> > > > > > a good way to correlate cooling device (ideally a part of GPU) to the
> > > > > > thermal_zone (which in our case is provided by tsens / temp_alarm
> > > > > > rather than GPU itself).
> > > > > >
> > > > > > >
> > > > > > > But what about freq?  I think, esp for cases where some "fw thing" is
> > > > > > > controlling the freq we end up needing to use gpu counters to measure
> > > > > > > the freq.
> > > > > >
> > > > > > For the freq it is slightly easier: /sys/class/devfreq/*, devices are
> > > > > > registered under proper parent (IOW, GPU). So one can read
> > > > > > /sys/class/devfreq/3d00000.gpu/cur_freq or
> > > > > > /sys/bus/platform/devices/3d00000.gpu/devfreq/3d00000.gpu/cur_freq.
> > > > > >
> > > > > > However because of the components usage, there is no link from
> > > > > > /sys/class/drm/card0
> > > > > > (/sys/devices/platform/soc@0/ae00000.display-subsystem/ae01000.display-controller/drm/card0)
> > > > > > to /sys/devices/platform/soc@0/3d00000.gpu, the GPU unit.
> > > > > >
> > > > > > Getting all these items together in a platform-independent way would
> > > > > > be definitely an important but complex topic.
> > > > >
> > > > > But I don't believe any of the pci gpu's use devfreq ;-)
> > > > >
> > > > > And also, you can't expect the CPU to actually know the freq when fw
> > > > > is the one controlling freq.  We can, currently, have a reasonable
> > > > > approximation from devfreq but that stops if IFPC is implemented.  And
> > > > > other GPUs have even less direct control.  So freq is a thing that I
> > > > > don't think we should try to get from "common frameworks"
> > > >
> > > > I think it might be useful to add another passive devfreq governor type for
> > > > external frequencies. This way we can use the same interface to export
> > > > non-CPU-controlled frequencies.
> > >
> > > Yeah this sounds like a decent idea to me too. It might also solve the fun
> > > of various pci devices having very non-standard freq controls in sysfs
> > > (looking at least at i915 here ...)
> >
> > I also like the idea of having some common infrastructure for the GPU freq.
> >
> > hwmon has good infrastructure, but it is more focused on individual
> > monitoring devices and not very welcoming to embedded monitoring and control.
> > I still want to check whether at least some freq control could be aligned
> > there.
> >
> > Another thing that complicates this is that Intel GPUs have multiple
> > frequency domains and controls with multipliers that are not very
> > standard or easy to integrate.
> >
> > At a quick glance, devfreq seems neat because it aligns with cpufreq
> > and governors. But again, it would be hard to align with the multiple domains
> > and controls. Still, it deserves a look.
> >
> > I will look at both fronts for Xe: hwmon and devfreq. Right now Xe has
> > far fewer controls than i915, but I can imagine there will soon be
> > requirements that make that grow, and I fear we would end up just
> > like i915. So I will take a look before that happens.
>
> So it looks like i915 (dgpu only) and nouveau already use hwmon.. so
> maybe this is a good way to expose temp.  Maybe we can wire up some
> sort of helper for drivers which use thermal_cooling_device (which can
> be composed of multiple sensors) to give back an aggregate temp for
> hwmon to report?

The thermal zone device already registers a hwmon device, see below. The
question is how to link that hwmon to the drm device. Strictly speaking, I
don't think that we can re-export it in a clean way.

# grep gpu /sys/class/hwmon/hwmon*/name
/sys/class/hwmon/hwmon15/name:gpu_top_thermal
/sys/class/hwmon/hwmon24/name:gpu_bottom_thermal
# ls /sys/class/hwmon/hwmon15/ -l
lrwxrwxrwx    1 root     root             0 Jan 26 08:14 device -> ../../thermal_zone15
-r--r--r--    1 root     root          4096 Jan 26 08:14 name
drwxr-xr-x    2 root     root             0 Jan 26 08:15 power
lrwxrwxrwx    1 root     root             0 Jan 26 08:12 subsystem -> ../../../../../class/hwmon
-r--r--r--    1 root     root          4096 Jan 26 08:14 temp1_input
-rw-r--r--    1 root     root          4096 Jan 26 08:12 uevent
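
The same lookup can be done from a tool rather than with grep. This is a sketch only: it matches hwmon devices by a substring of their name, which works for the gpu_top_thermal / gpu_bottom_thermal names on this board but is not a general naming convention. The function names and the root parameter are hypothetical; temp1_input is read in millidegrees Celsius per the hwmon convention.

```python
import glob
import os

def find_hwmon_by_name(substr, root="/sys/class/hwmon"):
    """Return {hwmon_dir: name} for hwmon devices whose name contains substr."""
    matches = {}
    for name_path in sorted(glob.glob(os.path.join(root, "hwmon*", "name"))):
        try:
            with open(name_path) as f:
                name = f.read().strip()
        except OSError:
            continue
        if substr in name:
            matches[os.path.dirname(name_path)] = name
    return matches

def read_temp_c(hwmon_dir, index=1):
    """Read temp<index>_input (millidegrees Celsius) and return degrees Celsius."""
    with open(os.path.join(hwmon_dir, f"temp{index}_input")) as f:
        return int(f.read().strip()) / 1000.0
```

Note that this finds the sensors only by name; it still says nothing about which drm device they belong to, which is the linking problem described above.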

> Freq could possibly be added to hwmon (ie. seems like a reasonable
> attribute to add).  Devfreq might also be an option but on arm it
> isn't necessarily associated with the drm device, whereas we could
> associate the hwmon with the drm device to make it easier for
> userspace to find.

Possibly we can register a virtual 'passive' devfreq being driven by
another active devfreq device.

Alex Deucher April 12, 2023, 8:23 p.m. UTC | #13
On Wed, Apr 12, 2023 at 4:10 PM Rob Clark <robdclark@gmail.com> wrote:
>
> On Wed, Apr 12, 2023 at 5:47 AM Rodrigo Vivi <rodrigo.vivi@intel.com> wrote:
> >
> > On Wed, Apr 12, 2023 at 10:11:32AM +0200, Daniel Vetter wrote:
> > > On Wed, Apr 12, 2023 at 01:36:52AM +0300, Dmitry Baryshkov wrote:
> > > > On 11/04/2023 21:28, Rob Clark wrote:
> > > > > On Tue, Apr 11, 2023 at 10:36 AM Dmitry Baryshkov
> > > > > <dmitry.baryshkov@linaro.org> wrote:
> > > > > >
> > > > > > On Tue, 11 Apr 2023 at 20:13, Rob Clark <robdclark@gmail.com> wrote:
> > > > > > >
> > > > > > > On Tue, Apr 11, 2023 at 9:53 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > > > > > >
> > > > > > > > On Tue, Apr 11, 2023 at 09:47:32AM -0700, Rob Clark wrote:
> > > > > > > > > On Mon, Apr 10, 2023 at 2:06 PM Rob Clark <robdclark@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > From: Rob Clark <robdclark@chromium.org>
> > > > > > > > > >
> > > > > > > > > > Similar motivation to other similar recent attempt[1].  But with an
> > > > > > > > > > attempt to have some shared code for this.  As well as documentation.
> > > > > > > > > >
> > > > > > > > > > It is probably a bit UMA-centric, I guess devices with VRAM might want
> > > > > > > > > > some placement stats as well.  But this seems like a reasonable start.
> > > > > > > > > >
> > > > > > > > > > Basic gputop support: https://patchwork.freedesktop.org/series/116236/
> > > > > > > > > > And already nvtop support: https://github.com/Syllo/nvtop/pull/204
> > > > > > > > >
> > > > > > > > > On a related topic, I'm wondering if it would make sense to report
> > > > > > > > > some more global things (temp, freq, etc) via fdinfo?  Some of this,
> > > > > > > > > tools like nvtop could get by trawling sysfs or other driver specific
> > > > > > > > > ways.  But maybe it makes sense to have these sort of things reported
> > > > > > > > > in a standardized way (even though they aren't really per-drm_file)
> > > > > > > >
> > > > > > > > I think that's a bit much layering violation, we'd essentially have to
> > > > > > > > reinvent the hwmon sysfs uapi in fdinfo. Not really a business I want to
> > > > > > > > be in :-)
> > > > > > >
> > > > > > > I guess this is true for temp (where there are thermal zones with
> > > > > > > potentially multiple temp sensors.. but I'm still digging my way thru
> > > > > > > the thermal_cooling_device stuff)
> > > > > >
> > > > > > It is slightly ugly. All thermal zones and cooling devices are virtual
> > > > > > devices (so, even no connection to the particular tsens device). One
> > > > > > can either enumerate them by checking
> > > > > > /sys/class/thermal/thermal_zoneN/type or enumerate them through
> > > > > > /sys/class/hwmon. For cooling devices again the only enumeration is
> > > > > > through /sys/class/thermal/cooling_deviceN/type.
> > > > > >
> > > > > > Probably it should be possible to push cooling devices and thermal
> > > > > > zones under corresponding providers. However I do not know if there is
> > > > > > a good way to correlate cooling device (ideally a part of GPU) to the
> > > > > > thermal_zone (which in our case is provided by tsens / temp_alarm
> > > > > > rather than GPU itself).
> > > > > >
> > > > > > >
> > > > > > > But what about freq?  I think, esp for cases where some "fw thing" is
> > > > > > > controlling the freq we end up needing to use gpu counters to measure
> > > > > > > the freq.
> > > > > >
> > > > > > For the freq it is slightly easier: /sys/class/devfreq/*, devices are
> > > > > > registered under proper parent (IOW, GPU). So one can read
> > > > > > /sys/class/devfreq/3d00000.gpu/cur_freq or
> > > > > > /sys/bus/platform/devices/3d00000.gpu/devfreq/3d00000.gpu/cur_freq.
> > > > > >
> > > > > > However because of the components usage, there is no link from
> > > > > > /sys/class/drm/card0
> > > > > > (/sys/devices/platform/soc@0/ae00000.display-subsystem/ae01000.display-controller/drm/card0)
> > > > > > to /sys/devices/platform/soc@0/3d00000.gpu, the GPU unit.
> > > > > >
> > > > > > Getting all these items together in a platform-independent way would
> > > > > > be definitely an important but complex topic.
> > > > >
> > > > > But I don't believe any of the pci gpu's use devfreq ;-)
> > > > >
> > > > > And also, you can't expect the CPU to actually know the freq when fw
> > > > > is the one controlling freq.  We can, currently, have a reasonable
> > > > > approximation from devfreq but that stops if IFPC is implemented.  And
> > > > > other GPUs have even less direct control.  So freq is a thing that I
> > > > > don't think we should try to get from "common frameworks"
> > > >
> > > > I think it might be useful to add another passive devfreq governor type for
> > > > external frequencies. This way we can use the same interface to export
> > > > non-CPU-controlled frequencies.
> > >
> > > Yeah this sounds like a decent idea to me too. It might also solve the fun
> > > of various pci devices having very non-standard freq controls in sysfs
> > > (looking at least at i915 here ...)
> >
> > I also like the idea of having some common infrastructure for the GPU freq.
> >
> > hwmon have a good infrastructure, but they are more focused on individual
> > monitoring devices and not very welcomed to embedded monitoring and control.
> > I still want to check the opportunity to see if at least some freq control
> > could be aligned there.
> >
> > Another thing that complicates that is that there are multiple frequency
> > domains and controls with multipliers in Intel GPU that are not very
> > standard or easy to integrate.
> >
> > On a quick glance this devfreq seems neat because it aligns with the cpufreq
> > and governors. But again it would be hard to align with the multiple domains
> > and controls. But it deserves a look.
> >
> > I will take a look to both fronts for Xe: hwmon and devfreq. Right now on
> > Xe we have a lot less controls than i915, but I can imagine soon there
> > will be requirements to make that to grow and I fear that we end up just
> > like i915. So I will take a look before that happens.
>
> So it looks like i915 (dgpu only) and nouveau already use hwmon.. so
> maybe this is a good way to expose temp.  Maybe we can wire up some
> sort of helper for drivers which use thermal_cooling_device (which can
> be composed of multiple sensors) to give back an aggregate temp for
> hwmon to report?

amdgpu uses hwmon as well for temp, voltage, power, etc.  One of the
problems with hwmon is that it's designed around individual sensors.
However, most GPU customers, at least in the datacenter, want an
atomic view of all of the attributes.  It would be nice if there were
some way to get a snapshot of all of the attributes at one time.
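
To illustrate the gap: the closest userspace can get today is reading
every attribute of one hwmon device back to back. That is one read per
file, so it is not atomic and values can change in between. A rough
Python sketch (the function name and the `hwmon_dir` argument are
hypothetical; the `*_input` naming is the usual hwmon sysfs convention):

```python
import time
from pathlib import Path


def hwmon_snapshot(hwmon_dir):
    """Read all *_input attributes of one hwmon device, one file at a time.

    Returns (values, elapsed_seconds). The elapsed time is the window
    during which the readings may be inconsistent with each other.
    """
    values = {}
    start = time.monotonic()
    for attr in sorted(Path(hwmon_dir).glob("*_input")):
        try:
            values[attr.name] = int(attr.read_text().strip())
        except (OSError, ValueError):
            # Sensor unreadable right now: leave it out of the snapshot.
            continue
    elapsed = time.monotonic() - start
    return values, elapsed
```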

>
> Freq could possibly be added to hwmon (ie. seems like a reasonable
> attribute to add).  Devfreq might also be an option but on arm it
> isn't necessarily associated with the drm device, whereas we could
> associate the hwmon with the drm device to make it easier for
> userspace to find.

freq attributes seem natural for hwmon, at least for reporting.  I'm
not familiar with devfreq; I wonder if it's flexible enough to deal
with devices that might have full or partial firmware control of the
frequencies.  Moreover, each clock domain is not necessarily
independent.  You might have multiple clock domains with different
voltage, thermal, and tdp dependencies.  Power limits are controlled
via hwmon and you may need to adjust them in order to make certain
clock changes.  Then add in overclocking support on top and it gets
more complex.

Alex

Rob Clark April 12, 2023, 8:34 p.m. UTC | #14
On Wed, Apr 12, 2023 at 1:19 PM Dmitry Baryshkov
<dmitry.baryshkov@linaro.org> wrote:
>
> On Wed, 12 Apr 2023 at 23:09, Rob Clark <robdclark@gmail.com> wrote:
> >
> > So it looks like i915 (dgpu only) and nouveau already use hwmon.. so
> > maybe this is a good way to expose temp.  Maybe we can wire up some
> > sort of helper for drivers which use thermal_cooling_device (which can
> > be composed of multiple sensors) to give back an aggregate temp for
> > hwmon to report?
>
> The thermal_device already registers the hwmon, see below. The
> question is about linking that hwmon to the drm. Strictly speaking, I
> don't think that we can reexport it in a clean way.
>
> # grep gpu /sys/class/hwmon/hwmon*/name
> /sys/class/hwmon/hwmon15/name:gpu_top_thermal
> /sys/class/hwmon/hwmon24/name:gpu_bottom_thermal

I can't get excited about userspace relying on naming conventions or
other heuristics like this.  Also, userspace's view of the world is
very much that there is a "gpu card", not a collection of parts.
(Windows seems to have the same view of the world.)  So we have the
component framework to assemble the various parts together into the
"device" that userspace expects to deal with.  We need to do something
similar for exposing temp and freq.
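
To make the objection concrete, this is roughly what a tool would have
to do today, sketched in Python. It mirrors the grep quoted above; the
function name is hypothetical, and the assumption that GPU sensors
carry "gpu" in their hwmon name is exactly the fragile part:

```python
from pathlib import Path


def find_hwmon_by_name(substring, root="/sys/class/hwmon"):
    """Return {hwmonN: name} for hwmon devices whose name contains substring."""
    matches = {}
    for name_file in sorted(Path(root).glob("hwmon*/name")):
        try:
            name = name_file.read_text().strip()
        except OSError:
            continue
        if substring in name:
            matches[name_file.parent.name] = name
    return matches
```

Nothing here ties a match back to a particular drm device, so on a
system with more than one GPU the result is ambiguous.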

> # ls /sys/class/hwmon/hwmon15/ -l
> lrwxrwxrwx    1 root     root             0 Jan 26 08:14 device ->
> ../../thermal_zone15
> -r--r--r--    1 root     root          4096 Jan 26 08:14 name
> drwxr-xr-x    2 root     root             0 Jan 26 08:15 power
> lrwxrwxrwx    1 root     root             0 Jan 26 08:12 subsystem ->
> ../../../../../class/hwmon
> -r--r--r--    1 root     root          4096 Jan 26 08:14 temp1_input
> -rw-r--r--    1 root     root          4096 Jan 26 08:12 uevent
>
> > Freq could possibly be added to hwmon (ie. seems like a reasonable
> > attribute to add).  Devfreq might also be an option but on arm it
> > isn't necessarily associated with the drm device, whereas we could
> > associate the hwmon with the drm device to make it easier for
> > userspace to find.
>
> Possibly we can register a virtual 'passive' devfreq being driven by
> another active devfreq device.

That's all fine and good, but it has the same problem that the existing
hwmons associated with the cooling-device have.

BR,
-R

Dmitry Baryshkov April 13, 2023, 12:27 a.m. UTC | #15
On 12/04/2023 23:34, Rob Clark wrote:
> On Wed, Apr 12, 2023 at 1:19 PM Dmitry Baryshkov
> <dmitry.baryshkov@linaro.org> wrote:
>>
>> On Wed, 12 Apr 2023 at 23:09, Rob Clark <robdclark@gmail.com> wrote:
>>>
>>> So it looks like i915 (dgpu only) and nouveau already use hwmon.. so
>>> maybe this is a good way to expose temp.  Maybe we can wire up some
>>> sort of helper for drivers which use thermal_cooling_device (which can
>>> be composed of multiple sensors) to give back an aggregate temp for
>>> hwmon to report?
>>
>> The thermal_device already registers the hwmon, see below. The
>> question is about linking that hwmon to the drm. Strictly speaking, I
>> don't think that we can reexport it in a clean way.
>>
>> # grep gpu /sys/class/hwmon/hwmon*/name
>> /sys/class/hwmon/hwmon15/name:gpu_top_thermal
>> /sys/class/hwmon/hwmon24/name:gpu_bottom_thermal
> 
> I can't get excited about userspace relying on naming conventions or
> other heuristics like this.  

As you can guess, me neither. We are not in the 2.4 world anymore.

> Also, userspace's view of the world is
> very much that there is a "gpu card", not a collection of parts.
> (Windows seems to have the same view of the world.)  So we have the
> component framework to assemble the various parts together into the
> "device" that userspace expects to deal with.  We need to do something
> similar for exposing temp and freq.

I think we are looking for something close to device links. We need to
create a userspace-visible link from one device to another across the
device hierarchy. The current device_link API is tied to suspend/resume,
but the overall idea seems to be close enough (in my opinion).
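
For comparison, when the hwmon is registered under the same parent
device as the drm card (typical for PCI GPUs such as amdgpu), userspace
needs no extra link and can simply follow the existing `device`
symlink. A minimal Python sketch with a hypothetical helper name; on
the SoC layout discussed here it would presumably return nothing,
since the hwmon hangs off a thermal zone instead:

```python
from pathlib import Path


def hwmon_for_drm_card(card, root="/sys/class/drm"):
    """Return the hwmonN names registered under the parent device of a drm card."""
    # /sys/class/drm/<card>/device is a symlink to the parent device.
    device = (Path(root) / card / "device").resolve()
    hwmon_dir = device / "hwmon"
    if not hwmon_dir.is_dir():
        return []
    return sorted(p.name for p in hwmon_dir.glob("hwmon*"))
```

A userspace-visible link of the kind proposed above would let the same
walk work when the sensor lives elsewhere in the hierarchy.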
