diff mbox series

[27/38] drm/i915: Introduce the i915_user_extension_method

Message ID 20190118140109.25261-28-chris@chris-wilson.co.uk (mailing list archive)
State New, archived
Headers show
Series [01/38] drm/i915/execlists: Store the highest priority context | expand

Commit Message

Chris Wilson Jan. 18, 2019, 2 p.m. UTC
An idea for extending uABI inspired by Vulkan's extension chains.
Instead of expanding the data struct for each ioctl every time we need
to add a new feature, define an extension chain instead. As we add
optional interfaces to control the ioctl, we define a new extension
struct that can be linked into the ioctl data only when required by the
user. The key advantage being able to ignore large control structs for
optional interfaces/extensions, while being able to process them in a
consistent manner.

In comparison to other extensible ioctls, the key difference is the
use of a linked chain of extension structs vs an array of tagged
pointers. For example,

struct drm_amdgpu_cs_chunk {
        __u32           chunk_id;
        __u32           length_dw;
        __u64           chunk_data;
};

struct drm_amdgpu_cs_in {
        __u32           ctx_id;
        __u32           bo_list_handle;
        __u32           num_chunks;
        __u32           _pad;
        __u64           chunks;
};

allows userspace to pass in array of pointers to extension structs, but
must therefore keep constructing that array along side the command stream.
In dynamic situations like that, a linked list is preferred and does not
similar from extra cache line misses as the extension structs themselves
must still be loaded separate to the chunks array.

v2: Apply the tail call optimisation directly to nip the worry of stack
overflow in the bud.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/Makefile               |  1 +
 drivers/gpu/drm/i915/i915_user_extensions.c | 42 +++++++++++++++++++++
 drivers/gpu/drm/i915/i915_user_extensions.h | 20 ++++++++++
 include/uapi/drm/i915_drm.h                 | 20 ++++++++++
 4 files changed, 83 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/i915_user_extensions.c
 create mode 100644 drivers/gpu/drm/i915/i915_user_extensions.h

Comments

Tvrtko Ursulin Jan. 22, 2019, 9:31 a.m. UTC | #1
On 18/01/2019 14:00, Chris Wilson wrote:
> An idea for extending uABI inspired by Vulkan's extension chains.
> Instead of expanding the data struct for each ioctl every time we need
> to add a new feature, define an extension chain instead. As we add
> optional interfaces to control the ioctl, we define a new extension
> struct that can be linked into the ioctl data only when required by the
> user. The key advantage being able to ignore large control structs for
> optional interfaces/extensions, while being able to process them in a
> consistent manner.
> 
> In comparison to other extensible ioctls, the key difference is the
> use of a linked chain of extension structs vs an array of tagged
> pointers. For example,
> 
> struct drm_amdgpu_cs_chunk {
>          __u32           chunk_id;
>          __u32           length_dw;
>          __u64           chunk_data;
> };
> 
> struct drm_amdgpu_cs_in {
>          __u32           ctx_id;
>          __u32           bo_list_handle;
>          __u32           num_chunks;
>          __u32           _pad;
>          __u64           chunks;
> };
> 
> allows userspace to pass in array of pointers to extension structs, but
> must therefore keep constructing that array along side the command stream.
> In dynamic situations like that, a linked list is preferred and does not
> similar from extra cache line misses as the extension structs themselves

s/similar/suffer/ I think.

> must still be loaded separate to the chunks array.
> 
> v2: Apply the tail call optimisation directly to nip the worry of stack
> overflow in the bud.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/Makefile               |  1 +
>   drivers/gpu/drm/i915/i915_user_extensions.c | 42 +++++++++++++++++++++
>   drivers/gpu/drm/i915/i915_user_extensions.h | 20 ++++++++++
>   include/uapi/drm/i915_drm.h                 | 20 ++++++++++
>   4 files changed, 83 insertions(+)
>   create mode 100644 drivers/gpu/drm/i915/i915_user_extensions.c
>   create mode 100644 drivers/gpu/drm/i915/i915_user_extensions.h
> 
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index 611115ed00db..f206fbc85cee 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -45,6 +45,7 @@ i915-y := i915_drv.o \
>   	  i915_sw_fence.o \
>   	  i915_syncmap.o \
>   	  i915_sysfs.o \
> +	  i915_user_extensions.o \
>   	  intel_csr.o \
>   	  intel_device_info.o \
>   	  intel_pm.o \
> diff --git a/drivers/gpu/drm/i915/i915_user_extensions.c b/drivers/gpu/drm/i915/i915_user_extensions.c
> new file mode 100644
> index 000000000000..5d90c368f185
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/i915_user_extensions.c
> @@ -0,0 +1,42 @@
> +/*
> + * SPDX-License-Identifier: MIT
> + *
> + * Copyright © 2018 Intel Corporation
> + */
> +
> +#include <linux/sched/signal.h>
> +#include <linux/uaccess.h>
> +#include <uapi/drm/i915_drm.h>
> +
> +#include "i915_user_extensions.h"
> +
> +int i915_user_extensions(struct i915_user_extension __user *ext,
> +			 const i915_user_extension_fn *tbl,
> +			 unsigned long count,

I would be tempted to limit the count to unsigned int. 4 billion 
extensions should be enough for everyone. :)

ABI can remain u64, but implementation I think in this form does not 
need it.

> +			 void *data)
> +{
> +	while (ext) {
> +		int err;
> +		u64 x;
> +
> +		cond_resched();
> +		if (signal_pending(current))
> +			return -EINTR;

What was your thinking behind this? It feels like, since here we are not 
doing any explicit wait/sleeps, that at this level the code shouldn't 
bother with it.

> +
> +		if (get_user(x, &ext->name))
> +			return -EFAULT;
> +
> +		err = -EINVAL;
> +		if (x < count && tbl[x])
> +			err = tbl[x](ext, data);
> +		if (err)
> +			return err;

We talked about providing unwind on failure ie. option for destructor 
call backs. You gave up on that?

> +
> +		if (get_user(x, &ext->next_extension))
> +			return -EFAULT;
> +
> +		ext = u64_to_user_ptr(x);
> +	}
> +
> +	return 0;
> +}
> diff --git a/drivers/gpu/drm/i915/i915_user_extensions.h b/drivers/gpu/drm/i915/i915_user_extensions.h
> new file mode 100644
> index 000000000000..313a510b068a
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/i915_user_extensions.h
> @@ -0,0 +1,20 @@
> +/*
> + * SPDX-License-Identifier: MIT
> + *
> + * Copyright © 2018 Intel Corporation
> + */
> +
> +#ifndef I915_USER_EXTENSIONS_H
> +#define I915_USER_EXTENSIONS_H
> +
> +struct i915_user_extension;
> +
> +typedef int (*i915_user_extension_fn)(struct i915_user_extension __user *ext,
> +				      void *data);
> +
> +int i915_user_extensions(struct i915_user_extension __user *ext,
> +			 const i915_user_extension_fn *tbl,
> +			 unsigned long count,
> +			 void *data);
> +
> +#endif /* I915_USER_EXTENSIONS_H */
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index 298b2e197744..6ee2221838da 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -62,6 +62,26 @@ extern "C" {
>   #define I915_ERROR_UEVENT		"ERROR"
>   #define I915_RESET_UEVENT		"RESET"
>   
> +/*
> + * i915_user_extension: Base class for defining a chain of extensions
> + *
> + * Many interfaces need to grow over time. In most cases we can simply
> + * extend the struct and have userspace pass in more data. Another option,
> + * as demonstrated by Vulkan's approach to providing extensions for forward
> + * and backward compatibility, is to use a list of optional structs to
> + * provide those extra details.
> + *
> + * The key advantage to using an extension chain is that it allows us to
> + * redefine the interface more easily than an ever growing struct of
> + * increasing complexity, and for large parts of that interface to be
> + * entirely optional. The downside is more pointer chasing; chasing across
> + * the __user boundary with pointers encapsulated inside u64.
> + */
> +struct i915_user_extension {
> +	__u64 next_extension;
> +	__u64 name;

s/name/id/ ?

> +};
> +
>   /*
>    * MOCS indexes used for GPU surfaces, defining the cacheability of the
>    * surface data and the coherency for this data wrt. CPU vs. GPU accesses.
> 

Regards,

Tvrtko
Chris Wilson Jan. 22, 2019, 10:47 a.m. UTC | #2
Quoting Tvrtko Ursulin (2019-01-22 09:31:31)
> 
> On 18/01/2019 14:00, Chris Wilson wrote:
> > +/*
> > + * SPDX-License-Identifier: MIT
> > + *
> > + * Copyright © 2018 Intel Corporation
> > + */
> > +
> > +#include <linux/sched/signal.h>
> > +#include <linux/uaccess.h>
> > +#include <uapi/drm/i915_drm.h>
> > +
> > +#include "i915_user_extensions.h"
> > +
> > +int i915_user_extensions(struct i915_user_extension __user *ext,
> > +                      const i915_user_extension_fn *tbl,
> > +                      unsigned long count,
> 
> I would be tempted to limit the count to unsigned int. 4 billion 
> extensions should be enough for everyone. :)

I just picked the natural type, thinking having it the same width as
ARRAY_SIZE might save a few questions from semantic analysers.

> > +{
> > +     while (ext) {
> > +             int err;
> > +             u64 x;
> > +
> > +             cond_resched();
> > +             if (signal_pending(current))
> > +                     return -EINTR;
> 
> What was your thinking behind this? It feels like, since here we are not 
> doing any explicit wait/sleeps, that at this level the code shouldn't 
> bother with it.

This ties into the discussion vvv

> > +             if (get_user(x, &ext->name))
> > +                     return -EFAULT;
> > +
> > +             err = -EINVAL;
> > +             if (x < count && tbl[x])
> > +                     err = tbl[x](ext, data);
> > +             if (err)
> > +                     return err;
> 
> We talked about providing unwind on failure ie. option for destructor 
> call backs. You gave up on that?

The patch is simply called 'merde'. Yes, unwind on failure does not lend
itself to a simple tail call implementation :) And it doesn't lend
itself nicely to writing the stacked cleanup handlers either. (So I
stuck with the solution of just doing a single cleanup on failure, safe
in the knowledge that the most complicated case in this series is
trivial.)

Thinking about the issues with providing a stack for unwind, leads to
the nasty question of how big a stack exactly do we want to provide.
Limiting the chain is required for defense against misuse, but what
depth? For the chained parameter setup of a single shot context create,
we could easily be into the dozens of parameters and extensions blocks.
The extreme I've been contemplating is a multi-batch execbuf setup (with
all the fancy extensions for e.g. semi-privileged fancy register setup),
for that I've been thinking about how this interface would extend to
supporting many chained chunks. What makes a good upper bound for stack
depth? 32? 64? 512? Pick a number, it won't be enough for someone. (Now,
really passing that much information through an ioctl means our design
is broken ;)

So... The break on interrupt there was for the silly protection against
recursion, if it doesn't result in an invalid command.

Another reason the patch was called merde.

I think the chained API extension is very powerful. Being able to do
arbitrary things like a single-shot context creation ioctl and still be
able to redefine/extend the interface as needs demands, is compelling.

> > +/*
> > + * i915_user_extension: Base class for defining a chain of extensions
> > + *
> > + * Many interfaces need to grow over time. In most cases we can simply
> > + * extend the struct and have userspace pass in more data. Another option,
> > + * as demonstrated by Vulkan's approach to providing extensions for forward
> > + * and backward compatibility, is to use a list of optional structs to
> > + * provide those extra details.
> > + *
> > + * The key advantage to using an extension chain is that it allows us to
> > + * redefine the interface more easily than an ever growing struct of
> > + * increasing complexity, and for large parts of that interface to be
> > + * entirely optional. The downside is more pointer chasing; chasing across
> > + * the __user boundary with pointers encapsulated inside u64.
> > + */
> > +struct i915_user_extension {
> > +     __u64 next_extension;
> > +     __u64 name;
> 
> s/name/id/ ?

I think name is common parlance for extensions/parameters? At least I've
been using it like it was :) And I was trying to retrospectively restrict
'id' for handles tracked by an idr! :)
-Chris
Tvrtko Ursulin Jan. 22, 2019, 11:05 a.m. UTC | #3
On 22/01/2019 10:47, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-01-22 09:31:31)
>>
>> On 18/01/2019 14:00, Chris Wilson wrote:
>>> +/*
>>> + * SPDX-License-Identifier: MIT
>>> + *
>>> + * Copyright © 2018 Intel Corporation
>>> + */
>>> +
>>> +#include <linux/sched/signal.h>
>>> +#include <linux/uaccess.h>
>>> +#include <uapi/drm/i915_drm.h>
>>> +
>>> +#include "i915_user_extensions.h"
>>> +
>>> +int i915_user_extensions(struct i915_user_extension __user *ext,
>>> +                      const i915_user_extension_fn *tbl,
>>> +                      unsigned long count,
>>
>> I would be tempted to limit the count to unsigned int. 4 billion
>> extensions should be enough for everyone. :)
> 
> I just picked the natural type, thinking having it the same width as
> ARRAY_SIZE might save a few questions from semantic analysers.
> 
>>> +{
>>> +     while (ext) {
>>> +             int err;
>>> +             u64 x;
>>> +
>>> +             cond_resched();
>>> +             if (signal_pending(current))
>>> +                     return -EINTR;
>>
>> What was your thinking behind this? It feels like, since here we are not
>> doing any explicit wait/sleeps, that at this level the code shouldn't
>> bother with it.
> 
> This ties into the discussion vvv
> 
>>> +             if (get_user(x, &ext->name))
>>> +                     return -EFAULT;
>>> +
>>> +             err = -EINVAL;
>>> +             if (x < count && tbl[x])
>>> +                     err = tbl[x](ext, data);
>>> +             if (err)
>>> +                     return err;
>>
>> We talked about providing unwind on failure ie. option for destructor
>> call backs. You gave up on that?
> 
> The patch is simply called 'merde'. Yes, unwind on failure does not lend
> itself to a simple tail call implementation :) And it doesn't lend
> itself nicely to writing the stacked cleanup handlers either. (So I
> stuck with the solution of just doing a single cleanup on failure, safe
> in the knowledge that the most complicated case in this series is
> trivial.)
> 
> Thinking about the issues with providing a stack for unwind, leads to
> the nasty question of how big a stack exactly do we want to provide.
> Limiting the chain is required for defense against misuse, but what
> depth? For the chained parameter setup of a single shot context create,
> we could easily be into the dozens of parameters and extensions blocks.
> The extreme I've been contemplating is a multi-batch execbuf setup (with
> all the fancy extensions for e.g. semi-privileged fancy register setup),
> for that I've been thinking about how this interface would extend to
> supporting many chained chunks. What makes a good upper bound for stack
> depth? 32? 64? 512? Pick a number, it won't be enough for someone. (Now,
> really passing that much information through an ioctl means our design
> is broken ;)
> 
> So... The break on interrupt there was for the silly protection against
> recursion, if it doesn't result in an invalid command.
> 
> Another reason the patch was called merde.
> 
> I think the chained API extension is very powerful. Being able to do
> arbitrary things like a single-shot context creation ioctl and still be
> able to redefine/extend the interface as needs demands, is compelling.

I did not think about unwind implementation details. :)

We could make the ABI double linked list? But the feeling is bad..

Or have some reasonable stack size and if overflows fall back to a very 
inefficient lookup from root to walk backwards. Sounds okay for this use 
case since we would never overflow even a small stack, and even the 
inefficient walk on unwind would be a) fast and b) rare.

But since you mention stack sizes like 512 I wonder I did not quite 
grasp how much you'd want to use extensions in the future. :)

>>> +/*
>>> + * i915_user_extension: Base class for defining a chain of extensions
>>> + *
>>> + * Many interfaces need to grow over time. In most cases we can simply
>>> + * extend the struct and have userspace pass in more data. Another option,
>>> + * as demonstrated by Vulkan's approach to providing extensions for forward
>>> + * and backward compatibility, is to use a list of optional structs to
>>> + * provide those extra details.
>>> + *
>>> + * The key advantage to using an extension chain is that it allows us to
>>> + * redefine the interface more easily than an ever growing struct of
>>> + * increasing complexity, and for large parts of that interface to be
>>> + * entirely optional. The downside is more pointer chasing; chasing across
>>> + * the __user boundary with pointers encapsulated inside u64.
>>> + */
>>> +struct i915_user_extension {
>>> +     __u64 next_extension;
>>> +     __u64 name;
>>
>> s/name/id/ ?
> 
> I think name is common parlance for extensions/parameters? At least I've
> been using it like it was :) And I was trying to retrospectively restrict
> 'id' for handles tracked by an idr! :)

I don't know if it is common, it is completely new to me. Maybe it is.. 
Or how about type?

Regards,

Tvrtko
diff mbox series

Patch

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 611115ed00db..f206fbc85cee 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -45,6 +45,7 @@  i915-y := i915_drv.o \
 	  i915_sw_fence.o \
 	  i915_syncmap.o \
 	  i915_sysfs.o \
+	  i915_user_extensions.o \
 	  intel_csr.o \
 	  intel_device_info.o \
 	  intel_pm.o \
diff --git a/drivers/gpu/drm/i915/i915_user_extensions.c b/drivers/gpu/drm/i915/i915_user_extensions.c
new file mode 100644
index 000000000000..5d90c368f185
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_user_extensions.c
@@ -0,0 +1,42 @@ 
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2018 Intel Corporation
+ */
+
+#include <linux/sched/signal.h>
+#include <linux/uaccess.h>
+#include <uapi/drm/i915_drm.h>
+
+#include "i915_user_extensions.h"
+
+int i915_user_extensions(struct i915_user_extension __user *ext,
+			 const i915_user_extension_fn *tbl,
+			 unsigned long count,
+			 void *data)
+{
+	while (ext) {
+		int err;
+		u64 x;
+
+		cond_resched();
+		if (signal_pending(current))
+			return -EINTR;
+
+		if (get_user(x, &ext->name))
+			return -EFAULT;
+
+		err = -EINVAL;
+		if (x < count && tbl[x])
+			err = tbl[x](ext, data);
+		if (err)
+			return err;
+
+		if (get_user(x, &ext->next_extension))
+			return -EFAULT;
+
+		ext = u64_to_user_ptr(x);
+	}
+
+	return 0;
+}
diff --git a/drivers/gpu/drm/i915/i915_user_extensions.h b/drivers/gpu/drm/i915/i915_user_extensions.h
new file mode 100644
index 000000000000..313a510b068a
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_user_extensions.h
@@ -0,0 +1,20 @@ 
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2018 Intel Corporation
+ */
+
+#ifndef I915_USER_EXTENSIONS_H
+#define I915_USER_EXTENSIONS_H
+
+struct i915_user_extension;
+
+typedef int (*i915_user_extension_fn)(struct i915_user_extension __user *ext,
+				      void *data);
+
+int i915_user_extensions(struct i915_user_extension __user *ext,
+			 const i915_user_extension_fn *tbl,
+			 unsigned long count,
+			 void *data);
+
+#endif /* I915_USER_EXTENSIONS_H */
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 298b2e197744..6ee2221838da 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -62,6 +62,26 @@  extern "C" {
 #define I915_ERROR_UEVENT		"ERROR"
 #define I915_RESET_UEVENT		"RESET"
 
+/*
+ * i915_user_extension: Base class for defining a chain of extensions
+ *
+ * Many interfaces need to grow over time. In most cases we can simply
+ * extend the struct and have userspace pass in more data. Another option,
+ * as demonstrated by Vulkan's approach to providing extensions for forward
+ * and backward compatibility, is to use a list of optional structs to
+ * provide those extra details.
+ *
+ * The key advantage to using an extension chain is that it allows us to
+ * redefine the interface more easily than an ever growing struct of
+ * increasing complexity, and for large parts of that interface to be
+ * entirely optional. The downside is more pointer chasing; chasing across
+ * the __user boundary with pointers encapsulated inside u64.
+ */
+struct i915_user_extension {
+	__u64 next_extension;
+	__u64 name;
+};
+
 /*
  * MOCS indexes used for GPU surfaces, defining the cacheability of the
  * surface data and the coherency for this data wrt. CPU vs. GPU accesses.