diff mbox

[v11,5/6] drm/i915: add query uAPI

Message ID 20180122082149.9847-6-lionel.g.landwerlin@intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Lionel Landwerlin Jan. 22, 2018, 8:21 a.m. UTC
There are a number of information that are readable from hardware
registers and that we would like to make accessible to userspace. One
particular example is the topology of the execution units (how are
execution units grouped in subslices and slices and also which ones
have been fused off for die recovery).

At the moment the GET_PARAM ioctl covers some basic needs, but
generally is only able to return a single value for each defined
parameter. This is a bit problematic with topology descriptions which
are array/maps of available units.

This change introduces a new ioctl that can deal with requests to fill
structures of potentially variable lengths. The user is expected fill
a query with length fields set at 0 on the first call, the kernel then
sets the length fields to the their expected values. A second call to
the kernel with length fields at their expected values will trigger a
copy of the data to the pointed memory locations.

The scope of this uAPI is only to provide information to userspace,
not to allow configuration of the device.

v2: Simplify dispatcher code iteration (Tvrtko)
    Tweak uapi drm_i915_query_item structure (Tvrtko)

v3: Rename pad fields into flags (Chris)
    Return error on flags field != 0 (Chris)
    Only copy length back to userspace in drm_i915_query_item (Chris)

v4: Use array of functions instead of switch (Chris)

v5: More comments in uapi (Tvrtko)
    Return query item errors in length field (All)

v6: Tweak uapi comments style to match the coding style (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/Makefile     |  1 +
 drivers/gpu/drm/i915/i915_drv.c   |  1 +
 drivers/gpu/drm/i915/i915_drv.h   |  3 ++
 drivers/gpu/drm/i915/i915_query.c | 67 +++++++++++++++++++++++++++++++++++++++
 include/uapi/drm/i915_drm.h       | 41 ++++++++++++++++++++++++
 5 files changed, 113 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/i915_query.c

Comments

Lionel Landwerlin Jan. 23, 2018, 2:17 p.m. UTC | #1
Hi all,

I've been trying to expose some information to userspace about the fused 
parts of the GPU.
This is the 4th attempt at getting this upstream, here are the previous 
ones :
     https://patchwork.freedesktop.org/patch/185959/
     https://patchwork.freedesktop.org/series/33436/
     https://patchwork.freedesktop.org/series/33950/

This last iteration was based upon some direction by Daniel : 
https://lists.freedesktop.org/archives/dri-devel/2017-December/160162.html
There was, I think, a fair point about having this working in 
environments where sysfs (mechanism used in the 3rd iteration) is not 
available (containers).

Following some discussion on IRC, it seems Joonas would like this 
rewritten in a such way that we essentially drop the generic mechanism 
introduced in this patch, and instead go for an additional ioctl() on 
the drm fd just for querying the state of a fused part of 
slice/subslice/eus.
The proposal is to have a single struct like :

struct drm_i915_topology {
    /* All field are in/out */
    int slice;
    int subslice;
    int eu;

    int enabled;
};

You would let the slice field to -1 and then the kernel would fill it 
with the max slice value. Same for subslice (with a valid slice value) 
and eu.
When querying with slice = 0, and all other fields to -1, the kernel 
would fill the enabled value with 0 or 1.
Essentially that would mean that an application wanting to query the 
state of all of the EUs would have to go through them one by one (which 
would be about ~100 ioctl() on SKL GT4 for example).

Apart from the fact that we'll probably end up adding another ioctl() 
for engine discovery, I don't have any problem with what Joonas is 
proposing.
It's just a bit annoying this comes up on the 4th rewrite.
I really wouldn't like to rewrite this one more time and get turned down 
because this isn't to the taste of one of the reviewer.
So my question is : Is everybody happy with what Joonas is proposing?
Anybody in favor of having a generic mechanism?

Thanks a lot,

-
Lionel



On 22/01/18 08:21, Lionel Landwerlin wrote:
> There are a number of information that are readable from hardware
> registers and that we would like to make accessible to userspace. One
> particular example is the topology of the execution units (how are
> execution units grouped in subslices and slices and also which ones
> have been fused off for die recovery).
>
> At the moment the GET_PARAM ioctl covers some basic needs, but
> generally is only able to return a single value for each defined
> parameter. This is a bit problematic with topology descriptions which
> are array/maps of available units.
>
> This change introduces a new ioctl that can deal with requests to fill
> structures of potentially variable lengths. The user is expected fill
> a query with length fields set at 0 on the first call, the kernel then
> sets the length fields to the their expected values. A second call to
> the kernel with length fields at their expected values will trigger a
> copy of the data to the pointed memory locations.
>
> The scope of this uAPI is only to provide information to userspace,
> not to allow configuration of the device.
>
> v2: Simplify dispatcher code iteration (Tvrtko)
>      Tweak uapi drm_i915_query_item structure (Tvrtko)
>
> v3: Rename pad fields into flags (Chris)
>      Return error on flags field != 0 (Chris)
>      Only copy length back to userspace in drm_i915_query_item (Chris)
>
> v4: Use array of functions instead of switch (Chris)
>
> v5: More comments in uapi (Tvrtko)
>      Return query item errors in length field (All)
>
> v6: Tweak uapi comments style to match the coding style (Lionel)
>
> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   drivers/gpu/drm/i915/Makefile     |  1 +
>   drivers/gpu/drm/i915/i915_drv.c   |  1 +
>   drivers/gpu/drm/i915/i915_drv.h   |  3 ++
>   drivers/gpu/drm/i915/i915_query.c | 67 +++++++++++++++++++++++++++++++++++++++
>   include/uapi/drm/i915_drm.h       | 41 ++++++++++++++++++++++++
>   5 files changed, 113 insertions(+)
>   create mode 100644 drivers/gpu/drm/i915/i915_query.c
>
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index 3bddd8a06806..b0415a3e2d59 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -69,6 +69,7 @@ i915-y += i915_cmd_parser.o \
>   	  i915_gem_timeline.o \
>   	  i915_gem_userptr.o \
>   	  i915_gemfs.o \
> +	  i915_query.o \
>   	  i915_trace_points.o \
>   	  i915_vma.o \
>   	  intel_breadcrumbs.o \
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 6d34bc1a7fff..c59be22f19ab 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -2830,6 +2830,7 @@ static const struct drm_ioctl_desc i915_ioctls[] = {
>   	DRM_IOCTL_DEF_DRV(I915_PERF_OPEN, i915_perf_open_ioctl, DRM_RENDER_ALLOW),
>   	DRM_IOCTL_DEF_DRV(I915_PERF_ADD_CONFIG, i915_perf_add_config_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW),
>   	DRM_IOCTL_DEF_DRV(I915_PERF_REMOVE_CONFIG, i915_perf_remove_config_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW),
> +	DRM_IOCTL_DEF_DRV(I915_QUERY, i915_query_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW),
>   };
>   
>   static struct drm_driver driver = {
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 8333692dac5a..50a7040af71e 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -3627,6 +3627,9 @@ extern void i915_perf_fini(struct drm_i915_private *dev_priv);
>   extern void i915_perf_register(struct drm_i915_private *dev_priv);
>   extern void i915_perf_unregister(struct drm_i915_private *dev_priv);
>   
> +/* i915_query.c */
> +int i915_query_ioctl(struct drm_device *dev, void *data, struct drm_file *file);
> +
>   /* i915_suspend.c */
>   extern int i915_save_state(struct drm_i915_private *dev_priv);
>   extern int i915_restore_state(struct drm_i915_private *dev_priv);
> diff --git a/drivers/gpu/drm/i915/i915_query.c b/drivers/gpu/drm/i915/i915_query.c
> new file mode 100644
> index 000000000000..5aa886313cf9
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/i915_query.c
> @@ -0,0 +1,67 @@
> +/*
> + * Copyright © 2017 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> + * IN THE SOFTWARE.
> + *
> + */
> +
> +#include "i915_drv.h"
> +#include <uapi/drm/i915_drm.h>
> +
> +static int (* const i915_query_funcs[])(struct drm_i915_private *dev_priv,
> +					struct drm_i915_query_item *query_item) = {
> +};
> +
> +int i915_query_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
> +{
> +	struct drm_i915_private *dev_priv = to_i915(dev);
> +	struct drm_i915_query *args = data;
> +	struct drm_i915_query_item __user *user_item_ptr =
> +		u64_to_user_ptr(args->items_ptr);
> +	u32 i;
> +
> +	if (args->flags != 0)
> +		return -EINVAL;
> +
> +	for (i = 0; i < args->num_items; i++, user_item_ptr++) {
> +		struct drm_i915_query_item item;
> +		u64 func_idx;
> +		int ret;
> +
> +		if (copy_from_user(&item, user_item_ptr, sizeof(item)))
> +			return -EFAULT;
> +
> +		if (item.query_id == 0)
> +			return -EINVAL;
> +
> +		func_idx = item.query_id - 1;
> +
> +		if (func_idx >= ARRAY_SIZE(i915_query_funcs))
> +			return -EINVAL;
> +
> +		ret = i915_query_funcs[func_idx](dev_priv, &item);
> +
> +		/* Only write the length back to userspace if they differ. */
> +		if (ret != item.length && put_user(ret, &user_item_ptr->length))
> +			return -EFAULT;
> +	}
> +
> +	return 0;
> +}
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index 536ee4febd74..03f031652d86 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -318,6 +318,7 @@ typedef struct _drm_i915_sarea {
>   #define DRM_I915_PERF_OPEN		0x36
>   #define DRM_I915_PERF_ADD_CONFIG	0x37
>   #define DRM_I915_PERF_REMOVE_CONFIG	0x38
> +#define DRM_I915_QUERY			0x39
>   
>   #define DRM_IOCTL_I915_INIT		DRM_IOW( DRM_COMMAND_BASE + DRM_I915_INIT, drm_i915_init_t)
>   #define DRM_IOCTL_I915_FLUSH		DRM_IO ( DRM_COMMAND_BASE + DRM_I915_FLUSH)
> @@ -375,6 +376,7 @@ typedef struct _drm_i915_sarea {
>   #define DRM_IOCTL_I915_PERF_OPEN	DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_OPEN, struct drm_i915_perf_open_param)
>   #define DRM_IOCTL_I915_PERF_ADD_CONFIG	DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_ADD_CONFIG, struct drm_i915_perf_oa_config)
>   #define DRM_IOCTL_I915_PERF_REMOVE_CONFIG	DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_REMOVE_CONFIG, __u64)
> +#define DRM_IOCTL_I915_QUERY			DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_QUERY, struct drm_i915_query)
>   
>   /* Allow drivers to submit batchbuffers directly to hardware, relying
>    * on the security mechanisms provided by hardware.
> @@ -1613,6 +1615,45 @@ struct drm_i915_perf_oa_config {
>   	__u64 flex_regs_ptr;
>   };
>   
> +
> +struct drm_i915_query_item {
> +	__u64 query_id;
> +
> +	/*
> +	 * When set to zero by userspace, this is filled with the size of the
> +	 * data to be written at the data_ptr pointer. The kernel set this
> +	 * value to a negative value to signal an error on a particular query
> +	 * item.
> +	 */
> +	__s32 length;
> +
> +	/*
> +	 * Unused for now.
> +	 */
> +	__u32 flags;
> +
> +	/*
> +	 * Data will be written at the location pointed by data_ptr when the
> +	 * value of length matches the length of the data to be written by the
> +	 * kernel.
> +	 */
> +	__u64 data_ptr;
> +};
> +
> +struct drm_i915_query {
> +	__u32 num_items;
> +
> +	/*
> +	 * Unused for now.
> +	 */
> +	__u32 flags;
> +
> +	/*
> +	 * This point to an array of num_items drm_i915_query_item structures.
> +	 */
> +	__u64 items_ptr;
> +};
> +
>   #if defined(__cplusplus)
>   }
>   #endif
Tvrtko Ursulin Jan. 24, 2018, 12:03 p.m. UTC | #2
On 23/01/2018 14:17, Lionel Landwerlin wrote:
> Hi all,
> 
> I've been trying to expose some information to userspace about the fused 
> parts of the GPU.
> This is the 4th attempt at getting this upstream, here are the previous 
> ones :
>      https://patchwork.freedesktop.org/patch/185959/
>      https://patchwork.freedesktop.org/series/33436/
>      https://patchwork.freedesktop.org/series/33950/
> 
> This last iteration was based upon some direction by Daniel : 
> https://lists.freedesktop.org/archives/dri-devel/2017-December/160162.html
> There was, I think, a fair point about having this working in 
> environments where sysfs (mechanism used in the 3rd iteration) is not 
> available (containers).
> 
> Following some discussion on IRC, it seems Joonas would like this 
> rewritten in a such way that we essentially drop the generic mechanism 
> introduced in this patch, and instead go for an additional ioctl() on 
> the drm fd just for querying the state of a fused part of 
> slice/subslice/eus.
> The proposal is to have a single struct like :
> 
> struct drm_i915_topology {
>     /* All field are in/out */
>     int slice;
>     int subslice;
>     int eu;
> 
>     int enabled;
> };
> 
> You would let the slice field to -1 and then the kernel would fill it 
> with the max slice value. Same for subslice (with a valid slice value) 
> and eu.
> When querying with slice = 0, and all other fields to -1, the kernel 
> would fill the enabled value with 0 or 1.
> Essentially that would mean that an application wanting to query the 
> state of all of the EUs would have to go through them one by one (which 
> would be about ~100 ioctl() on SKL GT4 for example).
> 
> Apart from the fact that we'll probably end up adding another ioctl() 
> for engine discovery, I don't have any problem with what Joonas is 
> proposing.
> It's just a bit annoying this comes up on the 4th rewrite.
> I really wouldn't like to rewrite this one more time and get turned down 
> because this isn't to the taste of one of the reviewer.
> So my question is : Is everybody happy with what Joonas is proposing?
> Anybody in favor of having a generic mechanism?

I am not very keen on this counter-proposal for two reasons.

First is that I think is a bit inelegant to have to query so many times 
just to get the full topology out. If this ends in some library, we may 
end up running this on every trivial client startup.

Secondly, it is kind of dispatcher in it's own right. Since the 
operation mode will depend on the combination of field values. As such a 
generic, or at least a more explicit, dispatcher, like the proposed 
i915_query_ioctl sounds cleaner to me.

I take the point we can't guess how many other users we will have for it 
in the future. So there is a little bit of a risk of adding something 
generic which won't be used a lot in the future.

Because apart from the three queries Lionel needs, I would be adding an 
engine info query and potentially, depending on userspace interest, 
engine queue depths. So that would be a maximum of five queries I am 
aware of would use the generic framework. Maybe too little, or maybe 
good enough for a start?

Alternative would be to go with dedicated ioctls for all of these. It 
would work for my needs, but I am not sure if Lionel can massage the 
GPU/EU topology into one, or at least fewer than three, nicer ioctls 
(Not the kind which needs to be called hundred times to get the complete 
state.).

Regards,

Tvrtko

> 
> On 22/01/18 08:21, Lionel Landwerlin wrote:
>> There are a number of information that are readable from hardware
>> registers and that we would like to make accessible to userspace. One
>> particular example is the topology of the execution units (how are
>> execution units grouped in subslices and slices and also which ones
>> have been fused off for die recovery).
>>
>> At the moment the GET_PARAM ioctl covers some basic needs, but
>> generally is only able to return a single value for each defined
>> parameter. This is a bit problematic with topology descriptions which
>> are array/maps of available units.
>>
>> This change introduces a new ioctl that can deal with requests to fill
>> structures of potentially variable lengths. The user is expected fill
>> a query with length fields set at 0 on the first call, the kernel then
>> sets the length fields to the their expected values. A second call to
>> the kernel with length fields at their expected values will trigger a
>> copy of the data to the pointed memory locations.
>>
>> The scope of this uAPI is only to provide information to userspace,
>> not to allow configuration of the device.
>>
>> v2: Simplify dispatcher code iteration (Tvrtko)
>>      Tweak uapi drm_i915_query_item structure (Tvrtko)
>>
>> v3: Rename pad fields into flags (Chris)
>>      Return error on flags field != 0 (Chris)
>>      Only copy length back to userspace in drm_i915_query_item (Chris)
>>
>> v4: Use array of functions instead of switch (Chris)
>>
>> v5: More comments in uapi (Tvrtko)
>>      Return query item errors in length field (All)
>>
>> v6: Tweak uapi comments style to match the coding style (Lionel)
>>
>> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
>> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> ---
>>   drivers/gpu/drm/i915/Makefile     |  1 +
>>   drivers/gpu/drm/i915/i915_drv.c   |  1 +
>>   drivers/gpu/drm/i915/i915_drv.h   |  3 ++
>>   drivers/gpu/drm/i915/i915_query.c | 67 
>> +++++++++++++++++++++++++++++++++++++++
>>   include/uapi/drm/i915_drm.h       | 41 ++++++++++++++++++++++++
>>   5 files changed, 113 insertions(+)
>>   create mode 100644 drivers/gpu/drm/i915/i915_query.c
>>
>> diff --git a/drivers/gpu/drm/i915/Makefile 
>> b/drivers/gpu/drm/i915/Makefile
>> index 3bddd8a06806..b0415a3e2d59 100644
>> --- a/drivers/gpu/drm/i915/Makefile
>> +++ b/drivers/gpu/drm/i915/Makefile
>> @@ -69,6 +69,7 @@ i915-y += i915_cmd_parser.o \
>>         i915_gem_timeline.o \
>>         i915_gem_userptr.o \
>>         i915_gemfs.o \
>> +      i915_query.o \
>>         i915_trace_points.o \
>>         i915_vma.o \
>>         intel_breadcrumbs.o \
>> diff --git a/drivers/gpu/drm/i915/i915_drv.c 
>> b/drivers/gpu/drm/i915/i915_drv.c
>> index 6d34bc1a7fff..c59be22f19ab 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.c
>> +++ b/drivers/gpu/drm/i915/i915_drv.c
>> @@ -2830,6 +2830,7 @@ static const struct drm_ioctl_desc i915_ioctls[] 
>> = {
>>       DRM_IOCTL_DEF_DRV(I915_PERF_OPEN, i915_perf_open_ioctl, 
>> DRM_RENDER_ALLOW),
>>       DRM_IOCTL_DEF_DRV(I915_PERF_ADD_CONFIG, 
>> i915_perf_add_config_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW),
>>       DRM_IOCTL_DEF_DRV(I915_PERF_REMOVE_CONFIG, 
>> i915_perf_remove_config_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW),
>> +    DRM_IOCTL_DEF_DRV(I915_QUERY, i915_query_ioctl, 
>> DRM_UNLOCKED|DRM_RENDER_ALLOW),
>>   };
>>   static struct drm_driver driver = {
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h 
>> b/drivers/gpu/drm/i915/i915_drv.h
>> index 8333692dac5a..50a7040af71e 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -3627,6 +3627,9 @@ extern void i915_perf_fini(struct 
>> drm_i915_private *dev_priv);
>>   extern void i915_perf_register(struct drm_i915_private *dev_priv);
>>   extern void i915_perf_unregister(struct drm_i915_private *dev_priv);
>> +/* i915_query.c */
>> +int i915_query_ioctl(struct drm_device *dev, void *data, struct 
>> drm_file *file);
>> +
>>   /* i915_suspend.c */
>>   extern int i915_save_state(struct drm_i915_private *dev_priv);
>>   extern int i915_restore_state(struct drm_i915_private *dev_priv);
>> diff --git a/drivers/gpu/drm/i915/i915_query.c 
>> b/drivers/gpu/drm/i915/i915_query.c
>> new file mode 100644
>> index 000000000000..5aa886313cf9
>> --- /dev/null
>> +++ b/drivers/gpu/drm/i915/i915_query.c
>> @@ -0,0 +1,67 @@
>> +/*
>> + * Copyright © 2017 Intel Corporation
>> + *
>> + * Permission is hereby granted, free of charge, to any person 
>> obtaining a
>> + * copy of this software and associated documentation files (the 
>> "Software"),
>> + * to deal in the Software without restriction, including without 
>> limitation
>> + * the rights to use, copy, modify, merge, publish, distribute, 
>> sublicense,
>> + * and/or sell copies of the Software, and to permit persons to whom the
>> + * Software is furnished to do so, subject to the following conditions:
>> + *
>> + * The above copyright notice and this permission notice (including 
>> the next
>> + * paragraph) shall be included in all copies or substantial portions 
>> of the
>> + * Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 
>> EXPRESS OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 
>> MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT 
>> SHALL
>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES 
>> OR OTHER
>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, 
>> ARISING
>> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR 
>> OTHER DEALINGS
>> + * IN THE SOFTWARE.
>> + *
>> + */
>> +
>> +#include "i915_drv.h"
>> +#include <uapi/drm/i915_drm.h>
>> +
>> +static int (* const i915_query_funcs[])(struct drm_i915_private 
>> *dev_priv,
>> +                    struct drm_i915_query_item *query_item) = {
>> +};
>> +
>> +int i915_query_ioctl(struct drm_device *dev, void *data, struct 
>> drm_file *file)
>> +{
>> +    struct drm_i915_private *dev_priv = to_i915(dev);
>> +    struct drm_i915_query *args = data;
>> +    struct drm_i915_query_item __user *user_item_ptr =
>> +        u64_to_user_ptr(args->items_ptr);
>> +    u32 i;
>> +
>> +    if (args->flags != 0)
>> +        return -EINVAL;
>> +
>> +    for (i = 0; i < args->num_items; i++, user_item_ptr++) {
>> +        struct drm_i915_query_item item;
>> +        u64 func_idx;
>> +        int ret;
>> +
>> +        if (copy_from_user(&item, user_item_ptr, sizeof(item)))
>> +            return -EFAULT;
>> +
>> +        if (item.query_id == 0)
>> +            return -EINVAL;
>> +
>> +        func_idx = item.query_id - 1;
>> +
>> +        if (func_idx >= ARRAY_SIZE(i915_query_funcs))
>> +            return -EINVAL;
>> +
>> +        ret = i915_query_funcs[func_idx](dev_priv, &item);
>> +
>> +        /* Only write the length back to userspace if they differ. */
>> +        if (ret != item.length && put_user(ret, &user_item_ptr->length))
>> +            return -EFAULT;
>> +    }
>> +
>> +    return 0;
>> +}
>> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
>> index 536ee4febd74..03f031652d86 100644
>> --- a/include/uapi/drm/i915_drm.h
>> +++ b/include/uapi/drm/i915_drm.h
>> @@ -318,6 +318,7 @@ typedef struct _drm_i915_sarea {
>>   #define DRM_I915_PERF_OPEN        0x36
>>   #define DRM_I915_PERF_ADD_CONFIG    0x37
>>   #define DRM_I915_PERF_REMOVE_CONFIG    0x38
>> +#define DRM_I915_QUERY            0x39
>>   #define DRM_IOCTL_I915_INIT        DRM_IOW( DRM_COMMAND_BASE + 
>> DRM_I915_INIT, drm_i915_init_t)
>>   #define DRM_IOCTL_I915_FLUSH        DRM_IO ( DRM_COMMAND_BASE + 
>> DRM_I915_FLUSH)
>> @@ -375,6 +376,7 @@ typedef struct _drm_i915_sarea {
>>   #define DRM_IOCTL_I915_PERF_OPEN    DRM_IOW(DRM_COMMAND_BASE + 
>> DRM_I915_PERF_OPEN, struct drm_i915_perf_open_param)
>>   #define DRM_IOCTL_I915_PERF_ADD_CONFIG    DRM_IOW(DRM_COMMAND_BASE + 
>> DRM_I915_PERF_ADD_CONFIG, struct drm_i915_perf_oa_config)
>>   #define DRM_IOCTL_I915_PERF_REMOVE_CONFIG    
>> DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_REMOVE_CONFIG, __u64)
>> +#define DRM_IOCTL_I915_QUERY            DRM_IOWR(DRM_COMMAND_BASE + 
>> DRM_I915_QUERY, struct drm_i915_query)
>>   /* Allow drivers to submit batchbuffers directly to hardware, relying
>>    * on the security mechanisms provided by hardware.
>> @@ -1613,6 +1615,45 @@ struct drm_i915_perf_oa_config {
>>       __u64 flex_regs_ptr;
>>   };
>> +
>> +struct drm_i915_query_item {
>> +    __u64 query_id;
>> +
>> +    /*
>> +     * When set to zero by userspace, this is filled with the size of 
>> the
>> +     * data to be written at the data_ptr pointer. The kernel set this
>> +     * value to a negative value to signal an error on a particular 
>> query
>> +     * item.
>> +     */
>> +    __s32 length;
>> +
>> +    /*
>> +     * Unused for now.
>> +     */
>> +    __u32 flags;
>> +
>> +    /*
>> +     * Data will be written at the location pointed by data_ptr when the
>> +     * value of length matches the length of the data to be written 
>> by the
>> +     * kernel.
>> +     */
>> +    __u64 data_ptr;
>> +};
>> +
>> +struct drm_i915_query {
>> +    __u32 num_items;
>> +
>> +    /*
>> +     * Unused for now.
>> +     */
>> +    __u32 flags;
>> +
>> +    /*
>> +     * This point to an array of num_items drm_i915_query_item 
>> structures.
>> +     */
>> +    __u64 items_ptr;
>> +};
>> +
>>   #if defined(__cplusplus)
>>   }
>>   #endif
> 
> 
>
Chris Wilson Jan. 24, 2018, 3:14 p.m. UTC | #3
Quoting Tvrtko Ursulin (2018-01-24 12:03:46)
> 
> On 23/01/2018 14:17, Lionel Landwerlin wrote:
> > Hi all,
> > 
> > I've been trying to expose some information to userspace about the fused 
> > parts of the GPU.
> > This is the 4th attempt at getting this upstream, here are the previous 
> > ones :
> >      https://patchwork.freedesktop.org/patch/185959/
> >      https://patchwork.freedesktop.org/series/33436/
> >      https://patchwork.freedesktop.org/series/33950/
> > 
> > This last iteration was based upon some direction by Daniel : 
> > https://lists.freedesktop.org/archives/dri-devel/2017-December/160162.html
> > There was, I think, a fair point about having this working in 
> > environments where sysfs (mechanism used in the 3rd iteration) is not 
> > available (containers).
> > 
> > Following some discussion on IRC, it seems Joonas would like this 
> > rewritten in a such way that we essentially drop the generic mechanism 
> > introduced in this patch, and instead go for an additional ioctl() on 
> > the drm fd just for querying the state of a fused part of 
> > slice/subslice/eus.
> > The proposal is to have a single struct like :
> > 
> > struct drm_i915_topology {
> >     /* All field are in/out */
> >     int slice;
> >     int subslice;
> >     int eu;
> > 
> >     int enabled;
> > };
> > 
> > You would let the slice field to -1 and then the kernel would fill it 
> > with the max slice value. Same for subslice (with a valid slice value) 
> > and eu.
> > When querying with slice = 0, and all other fields to -1, the kernel 
> > would fill the enabled value with 0 or 1.
> > Essentially that would mean that an application wanting to query the 
> > state of all of the EUs would have to go through them one by one (which 
> > would be about ~100 ioctl() on SKL GT4 for example).
> > 
> > Apart from the fact that we'll probably end up adding another ioctl() 
> > for engine discovery, I don't have any problem with what Joonas is 
> > proposing.
> > It's just a bit annoying this comes up on the 4th rewrite.
> > I really wouldn't like to rewrite this one more time and get turned down 
> > because this isn't to the taste of one of the reviewer.
> > So my question is : Is everybody happy with what Joonas is proposing?
> > Anybody in favor of having a generic mechanism?
> 
> I am not very keen on this counter-proposal for two reasons.
> 
> First is that I think is a bit inelegant to have to query so many times 
> just to get the full topology out. If this ends in some library, we may 
> end up running this on every trivial client startup.
> 
> Secondly, it is kind of dispatcher in it's own right. Since the 
> operation mode will depend on the combination of field values. As such a 
> generic, or at least a more explicit, dispatcher, like the proposed 
> i915_query_ioctl sounds cleaner to me.
> 
> I take the point we can't guess how many other users we will have for it 
> in the future. So there is a little bit of a risk of adding something 
> generic which won't be used a lot in the future.
> 
> Because apart from the three queries Lionel needs, I would be adding an 
> engine info query and potentially, depending on userspace interest, 
> engine queue depths. So that would be a maximum of five queries I am 
> aware of would use the generic framework. Maybe too little, or maybe 
> good enough for a start?

Another use case would be a single shot method to gather all GETPARAMs.

There's a lot of too'ing and fro'ing at the start of mesa trying to
determine all of the kernel's capabilities, which more or less come down
to a long series of parsing GETPARAM results. Bundling all of those up
into a single ioctl seems attractive to me (bonus for it being properly
defined and not a compat nightmare).
-Chris
Lionel Landwerlin Jan. 29, 2018, 4:24 p.m. UTC | #4
On 24/01/18 15:14, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-01-24 12:03:46)
>> On 23/01/2018 14:17, Lionel Landwerlin wrote:
>>> Hi all,
>>>
>>> I've been trying to expose some information to userspace about the fused
>>> parts of the GPU.
>>> This is the 4th attempt at getting this upstream, here are the previous
>>> ones :
>>>       https://patchwork.freedesktop.org/patch/185959/
>>>       https://patchwork.freedesktop.org/series/33436/
>>>       https://patchwork.freedesktop.org/series/33950/
>>>
>>> This last iteration was based upon some direction by Daniel :
>>> https://lists.freedesktop.org/archives/dri-devel/2017-December/160162.html
>>> There was, I think, a fair point about having this working in
>>> environments where sysfs (mechanism used in the 3rd iteration) is not
>>> available (containers).
>>>
>>> Following some discussion on IRC, it seems Joonas would like this
>>> rewritten in a such way that we essentially drop the generic mechanism
>>> introduced in this patch, and instead go for an additional ioctl() on
>>> the drm fd just for querying the state of a fused part of
>>> slice/subslice/eus.
>>> The proposal is to have a single struct like :
>>>
>>> struct drm_i915_topology {
>>>      /* All field are in/out */
>>>      int slice;
>>>      int subslice;
>>>      int eu;
>>>
>>>      int enabled;
>>> };
>>>
>>> You would let the slice field to -1 and then the kernel would fill it
>>> with the max slice value. Same for subslice (with a valid slice value)
>>> and eu.
>>> When querying with slice = 0, and all other fields to -1, the kernel
>>> would fill the enabled value with 0 or 1.
>>> Essentially that would mean that an application wanting to query the
>>> state of all of the EUs would have to go through them one by one (which
>>> would be about ~100 ioctl() on SKL GT4 for example).
>>>
>>> Apart from the fact that we'll probably end up adding another ioctl()
>>> for engine discovery, I don't have any problem with what Joonas is
>>> proposing.
>>> It's just a bit annoying this comes up on the 4th rewrite.
>>> I really wouldn't like to rewrite this one more time and get turned down
>>> because this isn't to the taste of one of the reviewer.
>>> So my question is : Is everybody happy with what Joonas is proposing?
>>> Anybody in favor of having a generic mechanism?
>> I am not very keen on this counter-proposal for two reasons.
>>
>> First is that I think is a bit inelegant to have to query so many times
>> just to get the full topology out. If this ends in some library, we may
>> end up running this on every trivial client startup.
>>
>> Secondly, it is kind of dispatcher in it's own right. Since the
>> operation mode will depend on the combination of field values. As such a
>> generic, or at least a more explicit, dispatcher, like the proposed
>> i915_query_ioctl sounds cleaner to me.
>>
>> I take the point we can't guess how many other users we will have for it
>> in the future. So there is a little bit of a risk of adding something
>> generic which won't be used a lot in the future.
>>
>> Because apart from the three queries Lionel needs, I would be adding an
>> engine info query and potentially, depending on userspace interest,
>> engine queue depths. So that would be a maximum of five queries I am
>> aware of would use the generic framework. Maybe too little, or maybe
>> good enough for a start?
> Another use case would be a single shot method to gather all GETPARAMs.
>
> There's a lot of too'ing and fro'ing at the start of mesa trying to
> determine all of the kernel's capabilities, which more or less come down
> to a long series of parsing GETPARAM results. Bundling all of those up
> into a single ioctl seems attractive to me (bonus for it being properly
> defined and not a compat nightmare).
> -Chris
>
Thanks all,

I don't really read much opposition to the current patch series. If 
anything we could actually want to do more it seems.
It would be good to have the green light and land that.
I've played quickly with a Chris' idea and will add a couple of RFC 
patches for discussion.

Cheers,

-
Lionel
Lahtinen, Joonas Jan. 30, 2018, 3:01 p.m. UTC | #5
On Mon, 2018-01-29 at 16:24 +0000, Lionel Landwerlin wrote:
>

> Thanks all,

> 

> I don't really read much opposition to the current patch series. If 

> anything we could actually want to do more it seems.

> It would be good to have the green light and land that.

> I've played quickly with a Chris' idea and will add a couple of RFC 

> patches for discussion.


Where is the latest iteration of the less generic IOCTL?

I'm still not sold on this, the resulting uAPI is quite horrendous with
the macros to interpret the results.

Regards, Joonas
---------------------------------------------------------------------
Intel Finland Oy
Registered Address: PL 281, 00181 Helsinki 
Business Identity Code: 0357606 - 4 
Domiciled in Helsinki 

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.
Lionel Landwerlin Jan. 30, 2018, 3:50 p.m. UTC | #6
On 30/01/18 15:01, Lahtinen, Joonas wrote:
> On Mon, 2018-01-29 at 16:24 +0000, Lionel Landwerlin wrote:
>> Thanks all,
>>
>> I don't really read much opposition to the current patch series. If
>> anything we could actually want to do more it seems.
>> It would be good to have the green light and land that.
>> I've played quickly with a Chris' idea and will add a couple of RFC
>> patches for discussion.
> Where is the latest iteration of the less generic IOCTL?
>
> I'm still not sold on this, the resulting uAPI is quite horrendous with
> the macros to interpret the results.
>
> Regards, Joonas
Not sure what is the less generic one. Here are the past attempts :

https://patchwork.freedesktop.org/patch/185959/
https://patchwork.freedesktop.org/series/33436/
https://patchwork.freedesktop.org/series/33950/

One of them was based off Tvrtko's stab at exposing the engines.

-
Lionel
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 3bddd8a06806..b0415a3e2d59 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -69,6 +69,7 @@  i915-y += i915_cmd_parser.o \
 	  i915_gem_timeline.o \
 	  i915_gem_userptr.o \
 	  i915_gemfs.o \
+	  i915_query.o \
 	  i915_trace_points.o \
 	  i915_vma.o \
 	  intel_breadcrumbs.o \
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 6d34bc1a7fff..c59be22f19ab 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -2830,6 +2830,7 @@  static const struct drm_ioctl_desc i915_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(I915_PERF_OPEN, i915_perf_open_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_PERF_ADD_CONFIG, i915_perf_add_config_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_PERF_REMOVE_CONFIG, i915_perf_remove_config_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(I915_QUERY, i915_query_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW),
 };
 
 static struct drm_driver driver = {
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 8333692dac5a..50a7040af71e 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3627,6 +3627,9 @@  extern void i915_perf_fini(struct drm_i915_private *dev_priv);
 extern void i915_perf_register(struct drm_i915_private *dev_priv);
 extern void i915_perf_unregister(struct drm_i915_private *dev_priv);
 
+/* i915_query.c */
+int i915_query_ioctl(struct drm_device *dev, void *data, struct drm_file *file);
+
 /* i915_suspend.c */
 extern int i915_save_state(struct drm_i915_private *dev_priv);
 extern int i915_restore_state(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_query.c b/drivers/gpu/drm/i915/i915_query.c
new file mode 100644
index 000000000000..5aa886313cf9
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_query.c
@@ -0,0 +1,67 @@ 
+/*
+ * Copyright © 2017 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+#include "i915_drv.h"
+#include <uapi/drm/i915_drm.h>
+
+static int (* const i915_query_funcs[])(struct drm_i915_private *dev_priv,
+					struct drm_i915_query_item *query_item) = {
+};
+
+int i915_query_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
+{
+	struct drm_i915_private *dev_priv = to_i915(dev);
+	struct drm_i915_query *args = data;
+	struct drm_i915_query_item __user *user_item_ptr =
+		u64_to_user_ptr(args->items_ptr);
+	u32 i;
+
+	if (args->flags != 0)
+		return -EINVAL;
+
+	for (i = 0; i < args->num_items; i++, user_item_ptr++) {
+		struct drm_i915_query_item item;
+		u64 func_idx;
+		int ret;
+
+		if (copy_from_user(&item, user_item_ptr, sizeof(item)))
+			return -EFAULT;
+
+		if (item.query_id == 0)
+			return -EINVAL;
+
+		func_idx = item.query_id - 1;
+
+		if (func_idx >= ARRAY_SIZE(i915_query_funcs))
+			return -EINVAL;
+
+		ret = i915_query_funcs[func_idx](dev_priv, &item);
+
+		/* Only write the length back to userspace if they differ. */
+		if (ret != item.length && put_user(ret, &user_item_ptr->length))
+			return -EFAULT;
+	}
+
+	return 0;
+}
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 536ee4febd74..03f031652d86 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -318,6 +318,7 @@  typedef struct _drm_i915_sarea {
 #define DRM_I915_PERF_OPEN		0x36
 #define DRM_I915_PERF_ADD_CONFIG	0x37
 #define DRM_I915_PERF_REMOVE_CONFIG	0x38
+#define DRM_I915_QUERY			0x39
 
 #define DRM_IOCTL_I915_INIT		DRM_IOW( DRM_COMMAND_BASE + DRM_I915_INIT, drm_i915_init_t)
 #define DRM_IOCTL_I915_FLUSH		DRM_IO ( DRM_COMMAND_BASE + DRM_I915_FLUSH)
@@ -375,6 +376,7 @@  typedef struct _drm_i915_sarea {
 #define DRM_IOCTL_I915_PERF_OPEN	DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_OPEN, struct drm_i915_perf_open_param)
 #define DRM_IOCTL_I915_PERF_ADD_CONFIG	DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_ADD_CONFIG, struct drm_i915_perf_oa_config)
 #define DRM_IOCTL_I915_PERF_REMOVE_CONFIG	DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_REMOVE_CONFIG, __u64)
+#define DRM_IOCTL_I915_QUERY			DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_QUERY, struct drm_i915_query)
 
 /* Allow drivers to submit batchbuffers directly to hardware, relying
  * on the security mechanisms provided by hardware.
@@ -1613,6 +1615,45 @@  struct drm_i915_perf_oa_config {
 	__u64 flex_regs_ptr;
 };
 
+
+struct drm_i915_query_item {
+	__u64 query_id;
+
+	/*
+	 * When set to zero by userspace, this is filled with the size of the
+	 * data to be written at the data_ptr pointer. The kernel set this
+	 * value to a negative value to signal an error on a particular query
+	 * item.
+	 */
+	__s32 length;
+
+	/*
+	 * Unused for now.
+	 */
+	__u32 flags;
+
+	/*
+	 * Data will be written at the location pointed by data_ptr when the
+	 * value of length matches the length of the data to be written by the
+	 * kernel.
+	 */
+	__u64 data_ptr;
+};
+
+struct drm_i915_query {
+	__u32 num_items;
+
+	/*
+	 * Unused for now.
+	 */
+	__u32 flags;
+
+	/*
+	 * This point to an array of num_items drm_i915_query_item structures.
+	 */
+	__u64 items_ptr;
+};
+
 #if defined(__cplusplus)
 }
 #endif