
[1/4] drm: add prime helpers

Message ID 1354817271-5121-2-git-send-email-aplattner@nvidia.com (mailing list archive)
State New, archived

Commit Message

Aaron Plattner Dec. 6, 2012, 6:07 p.m. UTC
Instead of reimplementing all of the dma_buf functionality in every driver,
create helpers drm_gem_prime_import and drm_gem_prime_export that implement it
in terms of new, lower-level hook functions:

  gem_prime_pin: callback when a buffer is created, used to pin buffers into GTT
  gem_prime_get_pages: convert a drm_gem_object to an sg_table for export
  gem_prime_import_sg: convert an sg_table into a drm_gem_object
  gem_prime_vmap, gem_prime_vunmap: map and unmap an object

These hooks are optional; drivers can opt in by using drm_gem_prime_import and
drm_gem_prime_export as the .gem_prime_import and .gem_prime_export fields of
struct drm_driver.

Signed-off-by: Aaron Plattner <aplattner@nvidia.com>
---
 drivers/gpu/drm/drm_prime.c | 190 +++++++++++++++++++++++++++++++++++++++++++-
 include/drm/drmP.h          |  15 ++++
 2 files changed, 204 insertions(+), 1 deletion(-)
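The opt-in dispatch the description talks about (a core helper calls an optional driver hook and proceeds either way) can be sketched as a self-contained userspace mock-up. Every type and function name below is an illustrative stand-in, not the real kernel API:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a GEM buffer object. */
struct gem_object {
	size_t size;
	int pinned;
};

/* Stand-in for the relevant slice of struct drm_driver: the pin hook
 * is optional, mirroring how gem_prime_pin may be left NULL. */
struct driver_hooks {
	int (*gem_prime_pin)(struct gem_object *obj);
};

/* Mirrors the shape of drm_gem_prime_export in the patch: call the
 * optional hook if present, propagate its error, otherwise export. */
static int prime_export(const struct driver_hooks *drv,
			struct gem_object *obj)
{
	if (drv->gem_prime_pin) {
		int ret = drv->gem_prime_pin(obj);
		if (ret)
			return ret;
	}
	return 0; /* stand-in for a successful dma_buf_export() */
}

/* A hypothetical driver's pin implementation. */
static int stub_pin(struct gem_object *obj)
{
	obj->pinned = 1;
	return 0;
}
```

A driver that needs pinning supplies stub_pin-style code; one that does not simply leaves the hook NULL and the helper skips it.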

Comments

Daniel Vetter Dec. 6, 2012, 9:46 p.m. UTC | #1
On Thu, Dec 06, 2012 at 10:07:48AM -0800, Aaron Plattner wrote:
> Instead of reimplementing all of the dma_buf functionality in every driver,
> create helpers drm_prime_import and drm_prime_export that implement them in
> terms of new, lower-level hook functions:
> 
>   gem_prime_pin: callback when a buffer is created, used to pin buffers into GTT
>   gem_prime_get_pages: convert a drm_gem_object to an sg_table for export
>   gem_prime_import_sg: convert an sg_table into a drm_gem_object
>   gem_prime_vmap, gem_prime_vunmap: map and unmap an object
> 
> These hooks are optional; drivers can opt in by using drm_gem_prime_import and
> drm_gem_prime_export as the .gem_prime_import and .gem_prime_export fields of
> struct drm_driver.
> 
> Signed-off-by: Aaron Plattner <aplattner@nvidia.com>

A few comments:

- Can you please add kerneldoc entries for these new helpers? Laurent
  Pinchart started a nice drm kerneldoc as an overview. And since then
  we've at least integrated the kerneldoc api reference at a nice place
  there and brushed it up a bit when touching drm core or helpers in a
  bigger patch series. See e.g. my recent dp helper series for what I'm
  looking for. I think we should have kerneldoc at least for the exported
  functions.

- Just an idea for all the essential noop cpu helpers: Maybe just call
  them dma_buf*noop and shovel them as convenience helpers into the
  dma-buf.c code in the core, for drivers that don't want/can't/won't
  bother to implement the cpu access support. Related: Exporting the
  functions in general so that drivers could pick&choose

- s/gem_prime_get_pages/gem_prime_get_sg/ drm/i915 now uses sg_tables
  everywhere to manage backing storage, and we're starting to use the
  stolen memory, too. Stolen memory is not struct page backed, so better
  make this clear in the function calls. I know that the current
  prime/dma_buf code from Dave doesn't really work too well (*cough*) if
  the backing storage isn't struct page backed, at least on the ttm import
  side. This also links in with my 2nd comment above - dma_buf requires
  that the exporter map the sg into the importer's device space to support
  fake subdevices which share the exporter's iommu/gart, hence such an
  incestuous exporter might want to override some of the functions.

- To make it a first-class helper and remove any and all midlayer stints
  from this (I truly loathe midlayers ...) maybe shovel this into a
  drm_prime_helper.c file. Also, adding functions to the core vtable for
  pure helpers is usually not too neat, since it makes it way too damn
  easy to mix things up and transform the helper into a midlayer by
  accident. E.g. the crtc helper functions all just use a generic void *
  pointer for their vtables, other helpers either subclass an existing
  struct or use their completely independent structs (which the driver can
  both embed into its own object struct and then do the right upclassing
  in the callbacks). tbh I haven't looked too closely at the series to assess
  what would make sense, but we have way too much gunk in the drm vtable
  already ...

Dunno whether this aligns with what you want to achieve here. But on a
quick hunch this code feels rather inflexible and just refactors the
identical copypasta code back into one copy. Anyway, just figured I'd
drop my 2 cents here, feel free to ignore (maybe save for the last one).

Cheers, Daniel


> ---
>  drivers/gpu/drm/drm_prime.c | 190 +++++++++++++++++++++++++++++++++++++++++++-
>  include/drm/drmP.h          |  15 ++++
>  2 files changed, 204 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
> index 7f12573..566c2ac 100644
> --- a/drivers/gpu/drm/drm_prime.c
> +++ b/drivers/gpu/drm/drm_prime.c
> @@ -53,7 +53,8 @@
>   * Self-importing: if userspace is using PRIME as a replacement for flink
>   * then it will get a fd->handle request for a GEM object that it created.
>   * Drivers should detect this situation and return back the gem object
> - * from the dma-buf private.
> + * from the dma-buf private.  Prime will do this automatically for drivers that
> + * use the drm_gem_prime_{import,export} helpers.
>   */
>  
>  struct drm_prime_member {
> @@ -62,6 +63,146 @@ struct drm_prime_member {
>  	uint32_t handle;
>  };
>  
> +static struct sg_table *drm_gem_map_dma_buf(struct dma_buf_attachment *attach,
> +		enum dma_data_direction dir)
> +{
> +	struct drm_gem_object *obj = attach->dmabuf->priv;
> +	struct sg_table *st;
> +
> +	mutex_lock_interruptible(&obj->dev->struct_mutex);
> +
> +	st = obj->dev->driver->gem_prime_get_pages(obj);
> +
> +	if (!IS_ERR_OR_NULL(st))
> +		dma_map_sg(attach->dev, st->sgl, st->nents, dir);
> +
> +	mutex_unlock(&obj->dev->struct_mutex);
> +	return st;
> +}
> +
> +static void drm_gem_unmap_dma_buf(struct dma_buf_attachment *attach,
> +		struct sg_table *sg, enum dma_data_direction dir)
> +{
> +	dma_unmap_sg(attach->dev, sg->sgl, sg->nents, dir);
> +	sg_free_table(sg);
> +	kfree(sg);
> +}
> +
> +static void drm_gem_dmabuf_release(struct dma_buf *dma_buf)
> +{
> +	struct drm_gem_object *obj = dma_buf->priv;
> +
> +	if (obj->export_dma_buf == dma_buf) {
> +		/* drop the reference on the export fd holds */
> +		obj->export_dma_buf = NULL;
> +		drm_gem_object_unreference_unlocked(obj);
> +	}
> +}
> +
> +static void *drm_gem_dmabuf_vmap(struct dma_buf *dma_buf)
> +{
> +	struct drm_gem_object *obj = dma_buf->priv;
> +	struct drm_device *dev = obj->dev;
> +	void *virtual = ERR_PTR(-EINVAL);
> +
> +	if (!dev->driver->gem_prime_vmap)
> +		return ERR_PTR(-EINVAL);
> +
> +	mutex_lock(&dev->struct_mutex);
> +	if (obj->vmapping_count) {
> +		obj->vmapping_count++;
> +		virtual = obj->vmapping_ptr;
> +		goto out_unlock;
> +	}
> +
> +	virtual = dev->driver->gem_prime_vmap(obj);
> +	if (IS_ERR(virtual)) {
> +		mutex_unlock(&dev->struct_mutex);
> +		return virtual;
> +	}
> +	obj->vmapping_count = 1;
> +	obj->vmapping_ptr = virtual;
> +out_unlock:
> +	mutex_unlock(&dev->struct_mutex);
> +	return obj->vmapping_ptr;
> +}
> +
> +static void drm_gem_dmabuf_vunmap(struct dma_buf *dma_buf, void *vaddr)
> +{
> +	struct drm_gem_object *obj = dma_buf->priv;
> +	struct drm_device *dev = obj->dev;
> +
> +	mutex_lock(&dev->struct_mutex);
> +	obj->vmapping_count--;
> +	if (obj->vmapping_count == 0) {
> +		dev->driver->gem_prime_vunmap(obj);
> +		obj->vmapping_ptr = NULL;
> +	}
> +	mutex_unlock(&dev->struct_mutex);
> +}
> +
> +static void *drm_gem_dmabuf_kmap_atomic(struct dma_buf *dma_buf,
> +		unsigned long page_num)
> +{
> +	return NULL;
> +}
> +
> +static void drm_gem_dmabuf_kunmap_atomic(struct dma_buf *dma_buf,
> +		unsigned long page_num, void *addr)
> +{
> +
> +}
> +static void *drm_gem_dmabuf_kmap(struct dma_buf *dma_buf,
> +		unsigned long page_num)
> +{
> +	return NULL;
> +}
> +
> +static void drm_gem_dmabuf_kunmap(struct dma_buf *dma_buf,
> +		unsigned long page_num, void *addr)
> +{
> +
> +}
> +
> +static int drm_gem_dmabuf_mmap(struct dma_buf *dma_buf,
> +		struct vm_area_struct *vma)
> +{
> +	return -EINVAL;
> +}
> +
> +static int drm_gem_begin_cpu_access(struct dma_buf *dma_buf,
> +		size_t start, size_t length, enum dma_data_direction direction)
> +{
> +	return -EINVAL;
> +}
> +
> +static const struct dma_buf_ops drm_gem_prime_dmabuf_ops =  {
> +	.map_dma_buf = drm_gem_map_dma_buf,
> +	.unmap_dma_buf = drm_gem_unmap_dma_buf,
> +	.release = drm_gem_dmabuf_release,
> +	.kmap = drm_gem_dmabuf_kmap,
> +	.kmap_atomic = drm_gem_dmabuf_kmap_atomic,
> +	.kunmap = drm_gem_dmabuf_kunmap,
> +	.kunmap_atomic = drm_gem_dmabuf_kunmap_atomic,
> +	.mmap = drm_gem_dmabuf_mmap,
> +	.vmap = drm_gem_dmabuf_vmap,
> +	.vunmap = drm_gem_dmabuf_vunmap,
> +	.begin_cpu_access = drm_gem_begin_cpu_access,
> +};
> +
> +struct dma_buf *drm_gem_prime_export(struct drm_device *dev,
> +				     struct drm_gem_object *obj, int flags)
> +{
> +	if (dev->driver->gem_prime_pin) {
> +		int ret = dev->driver->gem_prime_pin(obj);
> +		if (ret)
> +			return ERR_PTR(ret);
> +	}
> +	return dma_buf_export(obj, &drm_gem_prime_dmabuf_ops, obj->size,
> +			      0600);
> +}
> +EXPORT_SYMBOL(drm_gem_prime_export);
> +
>  int drm_gem_prime_handle_to_fd(struct drm_device *dev,
>  		struct drm_file *file_priv, uint32_t handle, uint32_t flags,
>  		int *prime_fd)
> @@ -117,6 +258,53 @@ int drm_gem_prime_handle_to_fd(struct drm_device *dev,
>  }
>  EXPORT_SYMBOL(drm_gem_prime_handle_to_fd);
>  
> +struct drm_gem_object *drm_gem_prime_import(struct drm_device *dev,
> +					    struct dma_buf *dma_buf)
> +{
> +	struct dma_buf_attachment *attach;
> +	struct sg_table *sg;
> +	struct drm_gem_object *obj;
> +	int ret;
> +
> +	if (!dev->driver->gem_prime_import_sg)
> +		return ERR_PTR(-EINVAL);
> +
> +	if (dma_buf->ops == &drm_gem_prime_dmabuf_ops) {
> +		obj = dma_buf->priv;
> +		if (obj->dev == dev) {
> +			drm_gem_object_reference(obj);
> +			return obj;
> +		}
> +	}
> +
> +	attach = dma_buf_attach(dma_buf, dev->dev);
> +	if (IS_ERR(attach))
> +		return ERR_PTR(PTR_ERR(attach));
> +
> +	sg = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);
> +	if (IS_ERR_OR_NULL(sg)) {
> +		ret = PTR_ERR(sg);
> +		goto fail_detach;
> +	}
> +
> +	obj = dev->driver->gem_prime_import_sg(dev, dma_buf->size, sg);
> +	if (IS_ERR(obj)) {
> +		ret = PTR_ERR(obj);
> +		goto fail_unmap;
> +	}
> +
> +	obj->import_attach = attach;
> +
> +	return obj;
> +
> +fail_unmap:
> +	dma_buf_unmap_attachment(attach, sg, DMA_BIDIRECTIONAL);
> +fail_detach:
> +	dma_buf_detach(dma_buf, attach);
> +	return ERR_PTR(ret);
> +}
> +EXPORT_SYMBOL(drm_gem_prime_import);
> +
>  int drm_gem_prime_fd_to_handle(struct drm_device *dev,
>  		struct drm_file *file_priv, int prime_fd, uint32_t *handle)
>  {
> diff --git a/include/drm/drmP.h b/include/drm/drmP.h
> index fad21c9..f65cae9 100644
> --- a/include/drm/drmP.h
> +++ b/include/drm/drmP.h
> @@ -674,6 +674,10 @@ struct drm_gem_object {
>  
>  	/* dma buf attachment backing this object */
>  	struct dma_buf_attachment *import_attach;
> +
> +	/* vmap information */
> +	int vmapping_count;
> +	void *vmapping_ptr;
>  };
>  
>  #include <drm/drm_crtc.h>
> @@ -919,6 +923,13 @@ struct drm_driver {
>  	/* import dmabuf -> GEM */
>  	struct drm_gem_object * (*gem_prime_import)(struct drm_device *dev,
>  				struct dma_buf *dma_buf);
> +	/* low-level interface used by drm_gem_prime_{import,export} */
> +	int (*gem_prime_pin)(struct drm_gem_object *obj);
> +	struct sg_table *(*gem_prime_get_pages)(struct drm_gem_object *obj);
> +	struct drm_gem_object *(*gem_prime_import_sg)(struct drm_device *dev,
> +				size_t size, struct sg_table *sg);
> +	void *(*gem_prime_vmap)(struct drm_gem_object *obj);
> +	void (*gem_prime_vunmap)(struct drm_gem_object *obj);
>  
>  	/* vga arb irq handler */
>  	void (*vgaarb_irq)(struct drm_device *dev, bool state);
> @@ -1540,9 +1551,13 @@ extern int drm_clients_info(struct seq_file *m, void* data);
>  extern int drm_gem_name_info(struct seq_file *m, void *data);
>  
>  
> +extern struct dma_buf *drm_gem_prime_export(struct drm_device *dev,
> +		struct drm_gem_object *obj, int flags);
>  extern int drm_gem_prime_handle_to_fd(struct drm_device *dev,
>  		struct drm_file *file_priv, uint32_t handle, uint32_t flags,
>  		int *prime_fd);
> +extern struct drm_gem_object *drm_gem_prime_import(struct drm_device *dev,
> +		struct dma_buf *dma_buf);
>  extern int drm_gem_prime_fd_to_handle(struct drm_device *dev,
>  		struct drm_file *file_priv, int prime_fd, uint32_t *handle);
>  
> -- 
> 1.8.0.1
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel
Daniel Vetter Dec. 6, 2012, 9:50 p.m. UTC | #2
On Thu, Dec 06, 2012 at 10:46:30PM +0100, Daniel Vetter wrote:
> On Thu, Dec 06, 2012 at 10:07:48AM -0800, Aaron Plattner wrote:
> > Instead of reimplementing all of the dma_buf functionality in every driver,
> > create helpers drm_prime_import and drm_prime_export that implement them in
> > terms of new, lower-level hook functions:
> > 
> >   gem_prime_pin: callback when a buffer is created, used to pin buffers into GTT
> >   gem_prime_get_pages: convert a drm_gem_object to an sg_table for export
> >   gem_prime_import_sg: convert an sg_table into a drm_gem_object
> >   gem_prime_vmap, gem_prime_vunmap: map and unmap an object
> > 
> > These hooks are optional; drivers can opt in by using drm_gem_prime_import and
> > drm_gem_prime_export as the .gem_prime_import and .gem_prime_export fields of
> > struct drm_driver.
> > 
> > Signed-off-by: Aaron Plattner <aplattner@nvidia.com>
> 
> A few comments:
> 
> - Can you please add kerneldoc entries for these new helpers? Laurent
>   Pinchart started a nice drm kerneldoc as an overview. And since then
>   we've at least integrated the kerneldoc api reference at a nice place
>   there and brushed it up a bit when touching drm core or helpers in a
>   bigger patch series. See e.g. my recent dp helper series for what I'm
>   looking for. I think we should have kerneldoc at least for the exported
>   functions.
> 
> - Just an idea for all the essential noop cpu helpers: Maybe just call
>   them dma_buf*noop and shovel them as convenience helpers into the
>   dma-buf.c code in the core, for drivers that don't want/can't/won't
>   bother to implement the cpu access support. Related: Exporting the
>   functions in general so that drivers could pick&choose

One more: For the cpu access noop helpers I'd vote for -ENOTTY as the more
canonical "not implemented error, you're talking to the wrong thing" error
instead of -EINVAL, which an exporter could throw back to the importer if
e.g. the range is outside of the size of the dma_buf. With a quick dma_buf
doc update we could then bless this as the official way to denounce cpu
access support.
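The errno convention being proposed can be made concrete with a userspace stand-in (the names and the simplified signature are assumptions for illustration): -ENOTTY as the blanket "CPU access not implemented" answer, keeping -EINVAL free for genuine argument errors such as an out-of-range request:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stand-in for struct dma_buf; only the size matters here. */
struct fake_dma_buf {
	size_t size;
};

/* Noop helper for exporters with no CPU access support at all. */
static int begin_cpu_access_noop(struct fake_dma_buf *buf,
				 size_t start, size_t len)
{
	(void)buf;
	(void)start;
	(void)len;
	return -ENOTTY;
}

/* An exporter that does implement CPU access still wants -EINVAL for
 * a range falling outside the buffer, which is why overloading
 * -EINVAL to also mean "not implemented" is ambiguous. */
static int begin_cpu_access_checked(struct fake_dma_buf *buf,
				    size_t start, size_t len)
{
	if (start > buf->size || len > buf->size - start)
		return -EINVAL;
	return 0;
}
```

With distinct errnos an importer can tell "this exporter never supports CPU access" apart from "my arguments were bad".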
-Daniel
Maarten Lankhorst Dec. 7, 2012, 2:03 p.m. UTC | #3
Op 06-12-12 22:50, Daniel Vetter schreef:
> On Thu, Dec 06, 2012 at 10:46:30PM +0100, Daniel Vetter wrote:
>> On Thu, Dec 06, 2012 at 10:07:48AM -0800, Aaron Plattner wrote:
>>> Instead of reimplementing all of the dma_buf functionality in every driver,
>>> create helpers drm_prime_import and drm_prime_export that implement them in
>>> terms of new, lower-level hook functions:
>>>
>>>   gem_prime_pin: callback when a buffer is created, used to pin buffers into GTT
>>>   gem_prime_get_pages: convert a drm_gem_object to an sg_table for export
>>>   gem_prime_import_sg: convert an sg_table into a drm_gem_object
>>>   gem_prime_vmap, gem_prime_vunmap: map and unmap an object
>>>
>>> These hooks are optional; drivers can opt in by using drm_gem_prime_import and
>>> drm_gem_prime_export as the .gem_prime_import and .gem_prime_export fields of
>>> struct drm_driver.
>>>
>>> Signed-off-by: Aaron Plattner <aplattner@nvidia.com>
>> A few comments:
>>
>> - Can you please add kerneldoc entries for these new helpers? Laurent
>>   Pinchart started a nice drm kerneldoc as an overview. And since then
>>   we've at least integrated the kerneldoc api reference at a nice place
>>   there and brushed it up a bit when touching drm core or helpers in a
>>   bigger patch series. See e.g. my recent dp helper series for what I'm
>>   looking for. I think we should have kerneldoc at least for the exported
>>   functions.
>>
>> - Just an idea for all the essential noop cpu helpers: Maybe just call
>>   them dma_buf*noop and shovel them as convenience helpers into the
>>   dma-buf.c code in the core, for drivers that don't want/can't/won't
>>   bother to implement the cpu access support. Related: Exporting the
>>   functions in general so that drivers could pick&choose
> One more: For the cpu access noop helpers I'd vote for -ENOTTY as the more
> canonical "not implemented error, you're talking to the wrong thing" error
> instead of -EINVAL, which an exporter could throw back to the importer if
> e.g. the range is outside of the size of the dma_buf. With a quick dma_buf
> doc update we could then bless this as the official way to denounce cpu
> access support.
>
Yeah, let's reinvent a new error to return, and make it "not a typewriter" to
confuse users, instead of using -ENODEV, which would actually be valid and
most descriptive here if you read the mmap page:

-ENODEV The underlying file system of the specified file does not support memory mapping.

~Maarten
Aaron Plattner Dec. 7, 2012, 5:58 p.m. UTC | #4
On 12/06/2012 01:46 PM, Daniel Vetter wrote:
> On Thu, Dec 06, 2012 at 10:07:48AM -0800, Aaron Plattner wrote:
>> Instead of reimplementing all of the dma_buf functionality in every driver,
>> create helpers drm_prime_import and drm_prime_export that implement them in
>> terms of new, lower-level hook functions:
>>
>>    gem_prime_pin: callback when a buffer is created, used to pin buffers into GTT
>>    gem_prime_get_pages: convert a drm_gem_object to an sg_table for export
>>    gem_prime_import_sg: convert an sg_table into a drm_gem_object
>>    gem_prime_vmap, gem_prime_vunmap: map and unmap an object
>>
>> These hooks are optional; drivers can opt in by using drm_gem_prime_import and
>> drm_gem_prime_export as the .gem_prime_import and .gem_prime_export fields of
>> struct drm_driver.
>>
>> Signed-off-by: Aaron Plattner <aplattner@nvidia.com>
>
> A few comments:
>
> - Can you please add kerneldoc entries for these new helpers? Laurent
>    Pinchart started a nice drm kerneldoc as an overview. And since then
>    we've at least integrated the kerneldoc api reference at a nice place
>    there and brushed it up a bit when touching drm core or helpers in a
>    bigger patch series. See e.g. my recent dp helper series for what I'm
>    looking for. I think we should have kerneldoc at least for the exported
>    functions.

Good call, I'll look into that.

> - Just an idea for all the essential noop cpu helpers: Maybe just call
>    them dma_buf*noop and shovel them as convenience helpers into the
>    dma-buf.c code in the core, for drivers that don't want/can't/won't
>    bother to implement the cpu access support. Related: Exporting the
>    functions in general so that drivers could pick&choose

Seems like a good idea as a follow-on change.  Do you think it would 
make more sense to have noop helpers, or allow them to be NULL in dma-buf.c?

> - s/gem_prime_get_pages/gem_prime_get_sg/ drm/i915 now uses sg_tables
>    everywhere to manage backing storage, and we're starting to use the
>    stolen memory, too. Stolen memory is not struct page backed, so better
>    make this clear in the function calls. I know that the current
>    prime/dma_buf code from Dave doesn't really work too well (*cough*) if
>    the backing storage isn't struct page backed, at least on the ttm import
>    side. This also links in with my 2nd comment above - dma_buf requires
>    that exporter map the sg into the importers device space to support fake
>    subdevices which share the exporters iommu/gart, hence such incetious
>    exporter might want to overwrite some of the functions.

Will do in a v2 set.

> - To make it a first-class helper and remove any and all midlayer stints
>    from this (I truly loath midlayers ...) maybe shovel this into a
>    drm_prime_helper.c file. Also, adding functions to the core vtable for
>    pure helpers is usually not too neat, since it makes it way too damn
>    easy to mix things up and transform the helper into a midlayer by
>    accident. E.g. the crtc helper functions all just use a generic void *
>    pointer for their vtables, other helpers either subclass an existing
>    struct or use their completely independent structs (which the driver can
>    both embed into it's own object struct and then do the right upclassing
>    in the callbacks). tbh haven't looked to closely at the series to asses
>    what would make sense, but we have way too much gunk in the drm vtable
>    already ...

I'm not sure I fully understand what you mean by this.  Are you
suggesting creating a separate struct for these functions and having a
void* pointer to that in struct drm_driver?

> Dunno whether this aligns with what you want to achieve here. But on a
> quick hunch this code feels rather unflexible and just refactors the
> identical copypasta code back into one copy. Anyway, just figured I'll
> drop my 2 cents here, feel free to ignore (maybe safe for the last one).

I think uncopypasta-ing the current code is beneficial on its own.  Do 
you think doing this now would make it harder to do the further 
refactoring you suggest later?

--
Aaron
Daniel Vetter Dec. 7, 2012, 6:48 p.m. UTC | #5
On Fri, Dec 07, 2012 at 09:58:38AM -0800, Aaron Plattner wrote:
> On 12/06/2012 01:46 PM, Daniel Vetter wrote:

[snip]

> >- Just an idea for all the essential noop cpu helpers: Maybe just call
> >   them dma_buf*noop and shovel them as convenience helpers into the
> >   dma-buf.c code in the core, for drivers that don't want/can't/won't
> >   bother to implement the cpu access support. Related: Exporting the
> >   functions in general so that drivers could pick&choose
> 
> Seems like a good idea as a follow-on change.  Do you think it would
> make more sense to have noop helpers, or allow them to be NULL in
> dma-buf.c?

Yeah, sounds good. Just thought that drm drivers aren't the only ones that
don't care much about cpu access support, so an officially blessed way
with documentation and a unified errno would be good.

[snip]

> >- To make it a first-class helper and remove any and all midlayer stints
> >   from this (I truly loath midlayers ...) maybe shovel this into a
> >   drm_prime_helper.c file. Also, adding functions to the core vtable for
> >   pure helpers is usually not too neat, since it makes it way too damn
> >   easy to mix things up and transform the helper into a midlayer by
> >   accident. E.g. the crtc helper functions all just use a generic void *
> >   pointer for their vtables, other helpers either subclass an existing
> >   struct or use their completely independent structs (which the driver can
> >   both embed into it's own object struct and then do the right upclassing
> >   in the callbacks). tbh haven't looked to closely at the series to asses
> >   what would make sense, but we have way too much gunk in the drm vtable
> >   already ...
> 
> I'm not sure I fully understand what you mean by this.  Are you
> suggesting creating a separate struct for these functions and having a
> void* pointer to that in struct drm_driver?

Create a new struct like

struct drm_prime_helper_funcs {
	int (*gem_prime_pin)(struct drm_gem_object *obj);
	struct sg_table *(*gem_prime_get_pages)(struct drm_gem_object *obj);
	struct drm_gem_object *(*gem_prime_import_sg)(struct drm_device *dev,
				size_t size, struct sg_table *sg);
	void *(*gem_prime_vmap)(struct drm_gem_object *obj);
	void (*gem_prime_vunmap)(struct drm_gem_object *obj);
};

struct drm_prime_helper_obj {
	/* vmap information */
	int vmapping_count;
	void *vmapping_ptr;

	/* any other data the helpers need ... */
	struct drm_prime_helper_funcs *funcs;
};

Drivers then fill out a func vtable and embed the helper obj into their
own gpu bo struct, i.e.

struct radeon_bo {
	struct ttm_bo ttm_bo;
	struct gem_buffer_object gem_bo;
	struct drm_prime_helper_obj prime_bo;

	/* other stuff */
};

In the prime helper callbacks, the driver then upcasts the prime_bo to
the radeon_bo, instead of upcasting the gem_bo to the radeon_bo.

Upsides: Guaranteed to not interfere with drivers not using it, with an
imo higher chance that the helper is cleanly separate from core stuff (no
guarantee ofc) and so easier to refactor in the future. And drm is full
with legacy stuff used by very few drivers, but highly intermingled with
drm core used by other drivers (my little holy war here is to slowly
refactor those things out and shovel them into drivers), hence why I
prefer to have an as clean separation of optional stuff as possible ...

Also, this way allows different drivers to use completely different helper
libraries for the same task, without those two helper libraries ever
interfering.

Downside: Adds a pointer to each bo struct for drivers using the helpers.
Given how big those tend to be, not too onerous. And maybe a bit of
confusion in driver callbacks, but since the linux container_of is
typechecked by the compiler, not too onerous either.
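The embed-and-upclass scheme can be demonstrated end to end in userspace (the struct names are stand-ins, and container_of is spelled out here since the real one lives in the kernel headers):

```c
#include <assert.h>
#include <stddef.h>

/* Userspace spelling of the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Helper-private object, as in the proposal. */
struct drm_prime_helper_obj {
	int vmapping_count;
	void *vmapping_ptr;
};

/* A driver bo embedding the helper object, radeon_bo style. */
struct fake_radeon_bo {
	int driver_state; /* stand-in for ttm/gem bookkeeping */
	struct drm_prime_helper_obj prime;
};

/* Helper callback upcasts from the embedded member to the driver bo. */
static struct fake_radeon_bo *to_radeon_bo(struct drm_prime_helper_obj *h)
{
	return container_of(h, struct fake_radeon_bo, prime);
}
```

The helper hands its callbacks only the embedded drm_prime_helper_obj; the driver recovers its own bo without the helper ever knowing the outer layout.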

> >Dunno whether this aligns with what you want to achieve here. But on a
> >quick hunch this code feels rather unflexible and just refactors the
> >identical copypasta code back into one copy. Anyway, just figured I'll
> >drop my 2 cents here, feel free to ignore (maybe safe for the last one).
> 
> I think uncopypasta-ing the current code is beneficial on its own.
> Do you think doing this now would make it harder to do the further
> refactoring you suggest later?

If we later on go with something like the above, it'll require going over
all drivers again, doing similar mechanic changes. If no one yet managed
to intermingle the helpers with the core in the meantime, that is ;-)

Cheers, Daniel
Aaron Plattner Dec. 7, 2012, 8:33 p.m. UTC | #6
On 12/07/2012 10:48 AM, Daniel Vetter wrote:
> On Fri, Dec 07, 2012 at 09:58:38AM -0800, Aaron Plattner wrote:
>> On 12/06/2012 01:46 PM, Daniel Vetter wrote:
>>> - To make it a first-class helper and remove any and all midlayer stints
>>>    from this (I truly loath midlayers ...) maybe shovel this into a
>>>    drm_prime_helper.c file. Also, adding functions to the core vtable for
>>>    pure helpers is usually not too neat, since it makes it way too damn
>>>    easy to mix things up and transform the helper into a midlayer by
>>>    accident. E.g. the crtc helper functions all just use a generic void *
>>>    pointer for their vtables, other helpers either subclass an existing
>>>    struct or use their completely independent structs (which the driver can
>>>    both embed into it's own object struct and then do the right upclassing
>>>    in the callbacks). tbh haven't looked to closely at the series to asses
>>>    what would make sense, but we have way too much gunk in the drm vtable
>>>    already ...
>>
>> I'm not sure I fully understand what you mean by this.  Are you
>> suggesting creating a separate struct for these functions and having a
>> void* pointer to that in struct drm_driver?
>
> Create a new struct like
>
> struct  drm_prime_helper_funcs {
>         int (*gem_prime_pin)(struct drm_gem_object *obj);
>         struct sg_table *(*gem_prime_get_pages)(struct drm_gem_object *obj);
>         struct drm_gem_object *(*gem_prime_import_sg)(struct drm_device *dev,
>                                 size_t size, struct sg_table *sg);
>         void *(*gem_prime_vmap)(struct drm_gem_object *obj);
>         void (*gem_prime_vunmap)(struct drm_gem_object *obj);
> }
>
> struct drm_prime_helper_obj {
>         /* vmap information */
>         int vmapping_count;
>         void *vmapping_ptr;
>
> 	/* any other data the helpers need ... */
> 	struct drm_prime_helper_funcs *funcs;
> }

Ah, I see what you mean.  This would also need either a pointer to the 
dev directly so the helpers can lock dev->struct_mutex, or a helper
function to convert from a drm_prime_helper_obj* to a gem_buffer_object* 
in a driver-specific way since the helper doesn't know the layout of the 
driver's bo structure.

So it would be something like

   struct drm_prime_helper_obj *helper_obj = dma_buf->priv;
   struct drm_prime_helper_funcs *funcs = helper_obj->funcs;
   struct drm_gem_object *obj = funcs->get_gem_obj(helper_obj);

   mutex_lock(&obj->dev->struct_mutex);
   funcs->whatever(obj);
   mutex_unlock(&obj->dev->struct_mutex);

and then for example, radeon_drm_prime_get_gem_obj would be

   struct radeon_bo *bo = container_of(helper_obj, struct radeon_bo,
                                       prime_helper);
   return &bo->gem_base;

?

That seems a little over-engineered to me, but I can code it up.

--
Aaron

> Drivers then fill out a func vtable and embedded the helper obj into their
> own gpu bo struct, i.e.
>
> struct radeon_bo {
> 	struct ttm_bo ttm_bo;
> 	struct gem_buffer_object gem_bo;
> 	struct drm_prime_helper_obj prime_bo;
>
> 	/* other stuff */
> }
>
> In the prime helper callbacks in the driver then upcast the prime_bo to
> the radeon_bo instead of the gem_bo to the radeon bo.
>
> Upsides: Guaranteed to not interfere with drivers not using it, with an
> imo higher chance that the helper is cleanly separate from core stuff (no
> guarantee ofc) and so easier to refactor in the future. And drm is full
> with legacy stuff used by very few drivers, but highly intermingled with
> drm core used by other drivers (my little holy war here is to slowly
> refactor those things out and shovel them into drivers), hence why I
> prefer to have an as clean separation of optional stuff as possible ...
>
> Also, this way allows different drivers to use completely different helper
> libraries for the same task, without those two helper libraries ever
> interfering.
>
> Downside: Adds a pointer to each bo struct for drivers using the helpers.
> Given how big those tend to be, not too onerous.  Given And maybe a bit of
> confusion in driver callbacks, but since the linux container_of is
> typechecked by the compiler, not too onerous either.
>
>>> Dunno whether this aligns with what you want to achieve here. But on a
>>> quick hunch this code feels rather unflexible and just refactors the
>>> identical copypasta code back into one copy. Anyway, just figured I'll
>>> drop my 2 cents here, feel free to ignore (maybe safe for the last one).
>>
>> I think uncopypasta-ing the current code is beneficial on its own.
>> Do you think doing this now would make it harder to do the further
>> refactoring you suggest later?
>
> If we later on go with something like the above, it'll require going over
> all drivers again, doing similar mechanic changes. If no one yet managed
> to intermingle the helpers with the core in the meantime, that is ;-)
>
> Cheers, Daniel
>
Daniel Vetter Dec. 7, 2012, 9:38 p.m. UTC | #7
On Fri, Dec 7, 2012 at 9:33 PM, Aaron Plattner <aplattner@nvidia.com> wrote:
> Ah, I see what you mean.  This would also need either a pointer to the drv
> directly so the helpers can lock drv->struct_mutex, or a helper function to
> convert from a drm_prime_helper_obj* to a gem_buffer_object* in a
> driver-specific way since the helper doesn't know the layout of the driver's
> bo structure.

Ah, locking, and it involves dev->struct_mutex. Another pet peeve of
mine ;-) Tbh I haven't looked at locking issues when stitching
together my quick proposal.

The problem with the dev->struct_mutex lock is that it's everywhere
and hence it's way too easy to create deadlocks with it. Especially
with prime, where simple rules like "take this lock first, always"
are suddenly much more complicated since gem drivers can assume both
the importer and exporter role ... I haven't done a full locking
review, but I expect current prime to deadlock (through driver calls)
when userspace starts doing funny things.

So imo it's best to move dev->struct_mutex as far away as possible
from helper functions and think really hard whether we really need it.
I've noticed three functions:

drm_gem_map_dma_buf: We can remove the locking here imo; if drivers
need dev->struct_mutex for internal consistency, they can take it in
their callbacks. The dma_buf_attachment is very much like a fancy sg
list + backing storage pointer. If multiple callers concurrently
interact with the same attachment, havoc might ensue. Importers should
add their own locking if concurrent access is possible (e.g. similar
to how filling in the same sg table concurrently and then calling
dma_map_sg concurrently would lead to chaos). Otoh creating/destroying
attachments with attach/detach is protected by the dma_buf->lock.

I've checked the dma-buf docs, and this is not specced there, but it
was very much the intention. So I think we can solve this with a
simple doc update ;-)

drm_gem_dmabuf_v(un)map: Besides any locking drivers might need for
internal consistency, the lock only protects the vmapping_count and
pointer and prevents racing concurrent vmap/vunmap calls into the
driver. Now vmap/vunmap is very much like an attachment, just that we
don't have an explicit struct for each client since there's only one
cpu address space.

So pretty much every driver is forced to reinvent vmap refcounting,
which feels like an interface bug. What about moving this code to the
dma-buf.c interface shim and protecting it with dma_buf->lock? Atm we
only take that lock for attach/detach, and drivers using vmap place
the vmap call at the same spot hw drivers would place the attach call,
so this shouldn't introduce any nefarious locking issues. So I think
this would be the right way to move forward.

And with that we've weaned the prime helpers off their dependency of
dev->struct_mutex, which should make them a notch more flexible and so
useful (since fighting locking inversions is a royal pain best
avoided).

Comments?

Cheers, Daniel
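The refcounted vmap Daniel describes, lifted out of the drivers, can be reduced to a few lines. This is a single-threaded userspace sketch under stated assumptions: the struct and function names are made up, and the kernel's mutex_lock(&dma_buf->lock) calls that would protect the counter are reduced to comments.

```c
/* Sketch of centralized vmap refcounting: the driver hook only runs on
 * the first vmap and the last vunmap; everything in between reuses the
 * cached pointer. Error handling for a failing driver vmap is omitted. */
#include <assert.h>
#include <stddef.h>

struct fake_dma_buf {		/* stand-in for struct dma_buf */
	int vmapping_count;
	void *vmapping_ptr;
};

static int driver_vmap_calls;	/* how often the driver hook actually ran */
static char backing[4096];	/* pretend CPU mapping of the buffer */

static void *driver_vmap(struct fake_dma_buf *buf)
{
	driver_vmap_calls++;
	return backing;
}

static void driver_vunmap(struct fake_dma_buf *buf)
{
}

void *fake_dma_buf_vmap(struct fake_dma_buf *buf)
{
	/* kernel: mutex_lock(&buf->lock); */
	if (buf->vmapping_count++ == 0)
		buf->vmapping_ptr = driver_vmap(buf);
	/* kernel: mutex_unlock(&buf->lock); */
	return buf->vmapping_ptr;
}

void fake_dma_buf_vunmap(struct fake_dma_buf *buf)
{
	/* kernel: mutex_lock(&buf->lock); */
	if (--buf->vmapping_count == 0) {
		driver_vunmap(buf);
		buf->vmapping_ptr = NULL;
	}
	/* kernel: mutex_unlock(&buf->lock); */
}
```

With the counter and pointer owned by dma-buf core and guarded by dma_buf->lock, each driver's gem_prime_vmap hook shrinks to the actual mapping work.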
Aaron Plattner Dec. 7, 2012, 10:07 p.m. UTC | #8
On 12/07/2012 01:38 PM, Daniel Vetter wrote:
> On Fri, Dec 7, 2012 at 9:33 PM, Aaron Plattner <aplattner@nvidia.com> wrote:
>> Ah, I see what you mean.  This would also need either a pointer to the dev
>> directly so the helpers can lock dev->struct_mutex, or a helper function to
>> convert from a drm_prime_helper_obj* to a gem_buffer_object* in a
>> driver-specific way since the helper doesn't know the layout of the driver's
>> bo structure.
>
> Ah, locking, and it involves dev->struct_mutex. Another pet peeve of
> mine ;-) Tbh I haven't looked at locking issues when stitching
> together my quick proposal.
>
> The problem with the dev->struct_mutex lock is that it's everywhere
> and hence it's way too easy to create deadlocks with it. Especially
> with prime, where a simple rule like "take this lock first, always"
> is suddenly much more complicated since gem drivers can assume both
> the importer and exporter role ... I haven't done a full locking
> review, but I expect current prime to deadlock (through driver calls)
> when userspace starts doing funny things.
>
> So imo it's best to move dev->struct_mutex as far away as possible
> from helper functions and think really hard whether we really need it.
> I've noticed three functions:
>
> drm_gem_map_dma_buf: We can remove the locking here imo; if drivers
> need dev->struct_mutex for internal consistency, they can take it in
> their callbacks. The dma_buf_attachment is very much like a fancy sg
> list + backing storage pointer. If multiple callers concurrently
> interact with the same attachment, havoc might ensue. Importers should
> add their own locking if concurrent access is possible (e.g. similar
> to how filling in the same sg table concurrently and then calling
> dma_map_sg concurrently would lead to chaos). Otoh creating/destroying
> attachments with attach/detach is protected by the dma_buf->lock.
>
> I've checked the dma-buf docs, and this is not specced there, but it
> was very much the intention. So I think we can solve this with a
> simple doc update ;-)
>
> drm_gem_dmabuf_v(un)map: Besides any locking drivers might need for
> internal consistency, the lock only protects the vmapping_count and
> pointer and prevents racing concurrent vmap/vunmap calls into the
> driver. Now vmap/vunmap is very much like an attachment, just that we
> don't have an explicit struct for each client since there's only one
> cpu address space.
>
> So pretty much every driver is forced to reinvent vmap refcounting,
> which feels like an interface bug. What about moving this code to the
> dma-buf.c interface shim and protecting it with dma_buf->lock? Atm we
> only take that lock for attach/detach, and drivers using vmap place
> the vmap call at the same spot hw drivers would place the attach call,
> so this shouldn't introduce any nefarious locking issues. So I think
> this would be the right way to move forward.
>
> And with that we've weaned the prime helpers off their dependency of
> dev->struct_mutex, which should make them a notch more flexible and so
> useful (since fighting locking inversions is a royal pain best
> avoided).
>
> Comments?

This all sounds good, but it's getting well outside the realm of what 
I'm comfortable doing in the short term as a kernel noob.  Would you 
object to going with a version of these changes that leaves the new 
hooks where they are now and then applying these suggestions later?  I 
don't think the risk of the helpers turning into a required mid-layer is 
very high given that i915 won't use them and it's one of the most-tested 
drivers.

--
Aaron
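For reference while reading the diff below, the optional-pin control flow that drm_gem_prime_export adds can be reduced to a few lines. Everything here is a simplified userspace stand-in, not the real DRM API: the struct definitions and return conventions are assumptions made for illustration.

```c
/* Sketch of the export path: gem_prime_pin is optional, and a pin
 * failure aborts the export before any dma_buf is created. */
#include <assert.h>
#include <stddef.h>

struct drm_gem_object { int dummy; };	/* opaque stand-in */

struct fake_drm_driver {		/* only the hook relevant here */
	int (*gem_prime_pin)(struct drm_gem_object *obj);
};

/* The returned object pointer stands in for the dma_buf that
 * dma_buf_export() would create; NULL stands in for ERR_PTR(ret). */
static void *fake_prime_export(struct fake_drm_driver *drv,
			       struct drm_gem_object *obj)
{
	if (drv->gem_prime_pin && drv->gem_prime_pin(obj))
		return NULL;	/* real code: return ERR_PTR(ret) */
	return obj;		/* real code: dma_buf_export(...) */
}

static int pin_ok(struct drm_gem_object *obj)   { (void)obj; return 0; }
static int pin_fail(struct drm_gem_object *obj) { (void)obj; return -12; }
```

Drivers without special pinning needs simply leave gem_prime_pin NULL and the export proceeds unconditionally.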
Patch

diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index 7f12573..566c2ac 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -53,7 +53,8 @@ 
  * Self-importing: if userspace is using PRIME as a replacement for flink
  * then it will get a fd->handle request for a GEM object that it created.
  * Drivers should detect this situation and return back the gem object
- * from the dma-buf private.
+ * from the dma-buf private.  Prime will do this automatically for drivers that
+ * use the drm_gem_prime_{import,export} helpers.
  */
 
 struct drm_prime_member {
@@ -62,6 +63,146 @@  struct drm_prime_member {
 	uint32_t handle;
 };
 
+static struct sg_table *drm_gem_map_dma_buf(struct dma_buf_attachment *attach,
+		enum dma_data_direction dir)
+{
+	struct drm_gem_object *obj = attach->dmabuf->priv;
+	struct sg_table *st;
+	int ret;
+
+	ret = mutex_lock_interruptible(&obj->dev->struct_mutex);
+	if (ret)
+		return ERR_PTR(ret);
+
+	st = obj->dev->driver->gem_prime_get_pages(obj);
+
+	if (!IS_ERR_OR_NULL(st))
+		dma_map_sg(attach->dev, st->sgl, st->nents, dir);
+
+	mutex_unlock(&obj->dev->struct_mutex);
+	return st;
+}
+
+static void drm_gem_unmap_dma_buf(struct dma_buf_attachment *attach,
+		struct sg_table *sg, enum dma_data_direction dir)
+{
+	dma_unmap_sg(attach->dev, sg->sgl, sg->nents, dir);
+	sg_free_table(sg);
+	kfree(sg);
+}
+
+static void drm_gem_dmabuf_release(struct dma_buf *dma_buf)
+{
+	struct drm_gem_object *obj = dma_buf->priv;
+
+	if (obj->export_dma_buf == dma_buf) {
+		/* drop the reference on the export fd holds */
+		obj->export_dma_buf = NULL;
+		drm_gem_object_unreference_unlocked(obj);
+	}
+}
+
+static void *drm_gem_dmabuf_vmap(struct dma_buf *dma_buf)
+{
+	struct drm_gem_object *obj = dma_buf->priv;
+	struct drm_device *dev = obj->dev;
+	void *virtual = ERR_PTR(-EINVAL);
+
+	if (!dev->driver->gem_prime_vmap)
+		return ERR_PTR(-EINVAL);
+
+	mutex_lock(&dev->struct_mutex);
+	if (obj->vmapping_count) {
+		obj->vmapping_count++;
+		virtual = obj->vmapping_ptr;
+		goto out_unlock;
+	}
+
+	virtual = dev->driver->gem_prime_vmap(obj);
+	if (IS_ERR(virtual)) {
+		mutex_unlock(&dev->struct_mutex);
+		return virtual;
+	}
+	obj->vmapping_count = 1;
+	obj->vmapping_ptr = virtual;
+out_unlock:
+	mutex_unlock(&dev->struct_mutex);
+	return obj->vmapping_ptr;
+}
+
+static void drm_gem_dmabuf_vunmap(struct dma_buf *dma_buf, void *vaddr)
+{
+	struct drm_gem_object *obj = dma_buf->priv;
+	struct drm_device *dev = obj->dev;
+
+	mutex_lock(&dev->struct_mutex);
+	obj->vmapping_count--;
+	if (obj->vmapping_count == 0) {
+		dev->driver->gem_prime_vunmap(obj);
+		obj->vmapping_ptr = NULL;
+	}
+	mutex_unlock(&dev->struct_mutex);
+}
+
+static void *drm_gem_dmabuf_kmap_atomic(struct dma_buf *dma_buf,
+		unsigned long page_num)
+{
+	return NULL;
+}
+
+static void drm_gem_dmabuf_kunmap_atomic(struct dma_buf *dma_buf,
+		unsigned long page_num, void *addr)
+{
+
+}
+
+static void *drm_gem_dmabuf_kmap(struct dma_buf *dma_buf,
+		unsigned long page_num)
+{
+	return NULL;
+}
+
+static void drm_gem_dmabuf_kunmap(struct dma_buf *dma_buf,
+		unsigned long page_num, void *addr)
+{
+
+}
+
+static int drm_gem_dmabuf_mmap(struct dma_buf *dma_buf,
+		struct vm_area_struct *vma)
+{
+	return -EINVAL;
+}
+
+static int drm_gem_begin_cpu_access(struct dma_buf *dma_buf,
+		size_t start, size_t length, enum dma_data_direction direction)
+{
+	return -EINVAL;
+}
+
+static const struct dma_buf_ops drm_gem_prime_dmabuf_ops =  {
+	.map_dma_buf = drm_gem_map_dma_buf,
+	.unmap_dma_buf = drm_gem_unmap_dma_buf,
+	.release = drm_gem_dmabuf_release,
+	.kmap = drm_gem_dmabuf_kmap,
+	.kmap_atomic = drm_gem_dmabuf_kmap_atomic,
+	.kunmap = drm_gem_dmabuf_kunmap,
+	.kunmap_atomic = drm_gem_dmabuf_kunmap_atomic,
+	.mmap = drm_gem_dmabuf_mmap,
+	.vmap = drm_gem_dmabuf_vmap,
+	.vunmap = drm_gem_dmabuf_vunmap,
+	.begin_cpu_access = drm_gem_begin_cpu_access,
+};
+
+struct dma_buf *drm_gem_prime_export(struct drm_device *dev,
+				     struct drm_gem_object *obj, int flags)
+{
+	if (dev->driver->gem_prime_pin) {
+		int ret = dev->driver->gem_prime_pin(obj);
+		if (ret)
+			return ERR_PTR(ret);
+	}
+	return dma_buf_export(obj, &drm_gem_prime_dmabuf_ops, obj->size,
+			      flags);
+}
+EXPORT_SYMBOL(drm_gem_prime_export);
+
 int drm_gem_prime_handle_to_fd(struct drm_device *dev,
 		struct drm_file *file_priv, uint32_t handle, uint32_t flags,
 		int *prime_fd)
@@ -117,6 +258,53 @@  int drm_gem_prime_handle_to_fd(struct drm_device *dev,
 }
 EXPORT_SYMBOL(drm_gem_prime_handle_to_fd);
 
+struct drm_gem_object *drm_gem_prime_import(struct drm_device *dev,
+					    struct dma_buf *dma_buf)
+{
+	struct dma_buf_attachment *attach;
+	struct sg_table *sg;
+	struct drm_gem_object *obj;
+	int ret;
+
+	if (!dev->driver->gem_prime_import_sg)
+		return ERR_PTR(-EINVAL);
+
+	if (dma_buf->ops == &drm_gem_prime_dmabuf_ops) {
+		obj = dma_buf->priv;
+		if (obj->dev == dev) {
+			drm_gem_object_reference(obj);
+			return obj;
+		}
+	}
+
+	attach = dma_buf_attach(dma_buf, dev->dev);
+	if (IS_ERR(attach))
+		return ERR_CAST(attach);
+
+	sg = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);
+	if (IS_ERR_OR_NULL(sg)) {
+		ret = PTR_ERR(sg);
+		goto fail_detach;
+	}
+
+	obj = dev->driver->gem_prime_import_sg(dev, dma_buf->size, sg);
+	if (IS_ERR(obj)) {
+		ret = PTR_ERR(obj);
+		goto fail_unmap;
+	}
+
+	obj->import_attach = attach;
+
+	return obj;
+
+fail_unmap:
+	dma_buf_unmap_attachment(attach, sg, DMA_BIDIRECTIONAL);
+fail_detach:
+	dma_buf_detach(dma_buf, attach);
+	return ERR_PTR(ret);
+}
+EXPORT_SYMBOL(drm_gem_prime_import);
+
 int drm_gem_prime_fd_to_handle(struct drm_device *dev,
 		struct drm_file *file_priv, int prime_fd, uint32_t *handle)
 {
diff --git a/include/drm/drmP.h b/include/drm/drmP.h
index fad21c9..f65cae9 100644
--- a/include/drm/drmP.h
+++ b/include/drm/drmP.h
@@ -674,6 +674,10 @@  struct drm_gem_object {
 
 	/* dma buf attachment backing this object */
 	struct dma_buf_attachment *import_attach;
+
+	/* vmap information */
+	int vmapping_count;
+	void *vmapping_ptr;
 };
 
 #include <drm/drm_crtc.h>
@@ -919,6 +923,13 @@  struct drm_driver {
 	/* import dmabuf -> GEM */
 	struct drm_gem_object * (*gem_prime_import)(struct drm_device *dev,
 				struct dma_buf *dma_buf);
+	/* low-level interface used by drm_gem_prime_{import,export} */
+	int (*gem_prime_pin)(struct drm_gem_object *obj);
+	struct sg_table *(*gem_prime_get_pages)(struct drm_gem_object *obj);
+	struct drm_gem_object *(*gem_prime_import_sg)(struct drm_device *dev,
+				size_t size, struct sg_table *sg);
+	void *(*gem_prime_vmap)(struct drm_gem_object *obj);
+	void (*gem_prime_vunmap)(struct drm_gem_object *obj);
 
 	/* vga arb irq handler */
 	void (*vgaarb_irq)(struct drm_device *dev, bool state);
@@ -1540,9 +1551,13 @@  extern int drm_clients_info(struct seq_file *m, void* data);
 extern int drm_gem_name_info(struct seq_file *m, void *data);
 
 
+extern struct dma_buf *drm_gem_prime_export(struct drm_device *dev,
+		struct drm_gem_object *obj, int flags);
 extern int drm_gem_prime_handle_to_fd(struct drm_device *dev,
 		struct drm_file *file_priv, uint32_t handle, uint32_t flags,
 		int *prime_fd);
+extern struct drm_gem_object *drm_gem_prime_import(struct drm_device *dev,
+		struct dma_buf *dma_buf);
 extern int drm_gem_prime_fd_to_handle(struct drm_device *dev,
 		struct drm_file *file_priv, int prime_fd, uint32_t *handle);