mbox series

[v2,0/5] Add option to mmap GEM buffers cached

Message ID 20210307202835.253907-1-paul@crapouillou.net (mailing list archive)
Headers show
Series Add option to mmap GEM buffers cached | expand

Message

Paul Cercueil March 7, 2021, 8:28 p.m. UTC
Rework of my previous patchset which added support for GEM buffers
backed by non-coherent memory to the ingenic-drm driver.

Having GEM buffers backed by non-coherent memory is interesting in
the particular case where it is faster to render to a non-coherent
buffer then sync the data cache, than to render to a write-combine
buffer, and (by extension) much faster than using a shadow buffer.
This is true for instance on some Ingenic SoCs, where even simple
blits (e.g. memcpy) are about three times faster using this method.

For the record, the previous patchset was accepted for 5.10 then had
to be reverted, as it conflicted with some changes made to the DMA API.

This new patchset is pretty different as it adds the functionality to
the DRM core. The first three patches add variants to existing functions
but with the "non-coherent memory" twist, exported as GPL symbols. The
fourth patch adds a function to be used with the damage helpers.
Finally, the last patch adds support for non-coherent GEM buffers to the
ingenic-drm driver. The functionality is enabled through a module
parameter, and is disabled by default.

Cheers,
-Paul

Paul Cercueil (5):
  drm: Add and export function drm_gem_cma_create_noncoherent
  drm: Add and export function drm_gem_cma_dumb_create_noncoherent
  drm: Add and export function drm_gem_cma_mmap_noncoherent
  drm: Add and export function drm_gem_cma_sync_data
  drm/ingenic: Add option to alloc cached GEM buffers

 drivers/gpu/drm/drm_gem_cma_helper.c      | 223 +++++++++++++++++++---
 drivers/gpu/drm/ingenic/ingenic-drm-drv.c |  49 ++++-
 drivers/gpu/drm/ingenic/ingenic-drm.h     |   4 +
 drivers/gpu/drm/ingenic/ingenic-ipu.c     |  14 +-
 include/drm/drm_gem_cma_helper.h          |  13 ++
 5 files changed, 273 insertions(+), 30 deletions(-)

Comments

Thomas Zimmermann March 8, 2021, 8:41 a.m. UTC | #1
Hi Paul,

having individual functions for each mode only makes sense if the 
decision is at compile time. But in patch 5, you're working around your 
earlier design by introducing in-driver helpers that select the correct 
CMA function.

In SHMEM helpers we have the flag map_wc in the GEM structure that 
selects the pages caching mode (wc vs uncached). I think CMA should use 
this design as well. Have a map_noncoherent flag in the CMA GEM object 
and set it from the driver's implementation of gem_create_object.

And in the long run, we could try to consolidate all drivers/helpers 
mapping flags in struct drm_gem_object.

Best regards
Thomas

Am 07.03.21 um 21:28 schrieb Paul Cercueil:
> Rework of my previous patchset which added support for GEM buffers
> backed by non-coherent memory to the ingenic-drm driver.
> 
> Having GEM buffers backed by non-coherent memory is interesting in
> the particular case where it is faster to render to a non-coherent
> buffer then sync the data cache, than to render to a write-combine
> buffer, and (by extension) much faster than using a shadow buffer.
> This is true for instance on some Ingenic SoCs, where even simple
> blits (e.g. memcpy) are about three times faster using this method.
> 
> For the record, the previous patchset was accepted for 5.10 then had
> to be reverted, as it conflicted with some changes made to the DMA API.
> 
> This new patchset is pretty different as it adds the functionality to
> the DRM core. The first three patches add variants to existing functions
> but with the "non-coherent memory" twist, exported as GPL symbols. The
> fourth patch adds a function to be used with the damage helpers.
> Finally, the last patch adds support for non-coherent GEM buffers to the
> ingenic-drm driver. The functionality is enabled through a module
> parameter, and is disabled by default.
> 
> Cheers,
> -Paul
> 
> Paul Cercueil (5):
>    drm: Add and export function drm_gem_cma_create_noncoherent
>    drm: Add and export function drm_gem_cma_dumb_create_noncoherent
>    drm: Add and export function drm_gem_cma_mmap_noncoherent
>    drm: Add and export function drm_gem_cma_sync_data
>    drm/ingenic: Add option to alloc cached GEM buffers
> 
>   drivers/gpu/drm/drm_gem_cma_helper.c      | 223 +++++++++++++++++++---
>   drivers/gpu/drm/ingenic/ingenic-drm-drv.c |  49 ++++-
>   drivers/gpu/drm/ingenic/ingenic-drm.h     |   4 +
>   drivers/gpu/drm/ingenic/ingenic-ipu.c     |  14 +-
>   include/drm/drm_gem_cma_helper.h          |  13 ++
>   5 files changed, 273 insertions(+), 30 deletions(-)
>
Paul Cercueil March 10, 2021, 7:02 p.m. UTC | #2
Hi Thomas,

Le lun. 8 mars 2021 à 9:41, Thomas Zimmermann <tzimmermann@suse.de> a 
écrit :
> Hi Paul,
> 
> having individual functions for each mode only makes sense if the 
> decision is at compile time. But in patch 5, you're working around 
> your earlier design by introducing in-driver helpers that select the 
> correct CMA function.
> 
> In SHMEM helpers we have the flag map_wc in the GEM structure that 
> selects the pages caching mode (wc vs uncached). I think CMA should 
> use this design as well. Have a map_noncoherent flag in the CMA GEM 
> object and set it from the driver's implementation of 
> gem_create_object.

Is that a new addition? That severely reduces the patchset size, which 
is perfect.

I'll send a V3 then.

Cheers,
-Paul

> And in the long run, we could try to consolidate all drivers/helpers 
> mapping flags in struct drm_gem_object.
> 
> Best regards
> Thomas
> 
> Am 07.03.21 um 21:28 schrieb Paul Cercueil:
>> Rework of my previous patchset which added support for GEM buffers
>> backed by non-coherent memory to the ingenic-drm driver.
>> 
>> Having GEM buffers backed by non-coherent memory is interesting in
>> the particular case where it is faster to render to a non-coherent
>> buffer then sync the data cache, than to render to a write-combine
>> buffer, and (by extension) much faster than using a shadow buffer.
>> This is true for instance on some Ingenic SoCs, where even simple
>> blits (e.g. memcpy) are about three times faster using this method.
>> 
>> For the record, the previous patchset was accepted for 5.10 then had
>> to be reverted, as it conflicted with some changes made to the DMA 
>> API.
>> 
>> This new patchset is pretty different as it adds the functionality to
>> the DRM core. The first three patches add variants to existing 
>> functions
>> but with the "non-coherent memory" twist, exported as GPL symbols. 
>> The
>> fourth patch adds a function to be used with the damage helpers.
>> Finally, the last patch adds support for non-coherent GEM buffers to 
>> the
>> ingenic-drm driver. The functionality is enabled through a module
>> parameter, and is disabled by default.
>> 
>> Cheers,
>> -Paul
>> 
>> Paul Cercueil (5):
>>    drm: Add and export function drm_gem_cma_create_noncoherent
>>    drm: Add and export function drm_gem_cma_dumb_create_noncoherent
>>    drm: Add and export function drm_gem_cma_mmap_noncoherent
>>    drm: Add and export function drm_gem_cma_sync_data
>>    drm/ingenic: Add option to alloc cached GEM buffers
>> 
>>   drivers/gpu/drm/drm_gem_cma_helper.c      | 223 
>> +++++++++++++++++++---
>>   drivers/gpu/drm/ingenic/ingenic-drm-drv.c |  49 ++++-
>>   drivers/gpu/drm/ingenic/ingenic-drm.h     |   4 +
>>   drivers/gpu/drm/ingenic/ingenic-ipu.c     |  14 +-
>>   include/drm/drm_gem_cma_helper.h          |  13 ++
>>   5 files changed, 273 insertions(+), 30 deletions(-)
>> 
> 
> --
> Thomas Zimmermann
> Graphics Driver Developer
> SUSE Software Solutions Germany GmbH
> Maxfeldstr. 5, 90409 Nürnberg, Germany
> (HRB 36809, AG Nürnberg)
> Geschäftsführer: Felix Imendörffer
>
Thomas Zimmermann March 11, 2021, 7:52 a.m. UTC | #3
Hi

Am 10.03.21 um 20:02 schrieb Paul Cercueil:
> Hi Thomas,
> 
> Le lun. 8 mars 2021 à 9:41, Thomas Zimmermann <tzimmermann@suse.de> a 
> écrit :
>> Hi Paul,
>>
>> having individual functions for each mode only makes sense if the 
>> decision is at compile time. But in patch 5, you're working around 
>> your earlier design by introducing in-driver helpers that select the 
>> correct CMA function.
>>
>> In SHMEM helpers we have the flag map_wc in the GEM structure that 
>> selects the pages caching mode (wc vs uncached). I think CMA should 

Re-reading this, it should rather be WC and cached.

>> use this design as well. Have a map_noncoherent flag in the CMA GEM 
>> object and set it from the driver's implementation of gem_create_object.
> 
> Is that a new addition? That severely reduces the patchset size, which 
> is perfect.

It was added a few months ago, around the time you send the first 
version of the patches at hand.

Originally, SHMEM uses write combining by default. And several drivers 
used default mapping flags instead (so usually cached). IIRC I 
streamlined everything and flipped the default. Most drivers can use 
cached mappings and only some require WC.

Recently there was also a patchset that added support for cached video 
RAM (or something like that). So at some point we could store these 
flags in drm_gem_object. For now, I'd just put a flag into 
drm_gem_cma_object.

> 
> I'll send a V3 then.

Great!

Best regards
Thomas

> 
> Cheers,
> -Paul
> 
>> And in the long run, we could try to consolidate all drivers/helpers 
>> mapping flags in struct drm_gem_object.
>>
>> Best regards
>> Thomas
>>
>> Am 07.03.21 um 21:28 schrieb Paul Cercueil:
>>> Rework of my previous patchset which added support for GEM buffers
>>> backed by non-coherent memory to the ingenic-drm driver.
>>>
>>> Having GEM buffers backed by non-coherent memory is interesting in
>>> the particular case where it is faster to render to a non-coherent
>>> buffer then sync the data cache, than to render to a write-combine
>>> buffer, and (by extension) much faster than using a shadow buffer.
>>> This is true for instance on some Ingenic SoCs, where even simple
>>> blits (e.g. memcpy) are about three times faster using this method.
>>>
>>> For the record, the previous patchset was accepted for 5.10 then had
>>> to be reverted, as it conflicted with some changes made to the DMA API.
>>>
>>> This new patchset is pretty different as it adds the functionality to
>>> the DRM core. The first three patches add variants to existing functions
>>> but with the "non-coherent memory" twist, exported as GPL symbols. The
>>> fourth patch adds a function to be used with the damage helpers.
>>> Finally, the last patch adds support for non-coherent GEM buffers to the
>>> ingenic-drm driver. The functionality is enabled through a module
>>> parameter, and is disabled by default.
>>>
>>> Cheers,
>>> -Paul
>>>
>>> Paul Cercueil (5):
>>>    drm: Add and export function drm_gem_cma_create_noncoherent
>>>    drm: Add and export function drm_gem_cma_dumb_create_noncoherent
>>>    drm: Add and export function drm_gem_cma_mmap_noncoherent
>>>    drm: Add and export function drm_gem_cma_sync_data
>>>    drm/ingenic: Add option to alloc cached GEM buffers
>>>
>>>   drivers/gpu/drm/drm_gem_cma_helper.c      | 223 +++++++++++++++++++---
>>>   drivers/gpu/drm/ingenic/ingenic-drm-drv.c |  49 ++++-
>>>   drivers/gpu/drm/ingenic/ingenic-drm.h     |   4 +
>>>   drivers/gpu/drm/ingenic/ingenic-ipu.c     |  14 +-
>>>   include/drm/drm_gem_cma_helper.h          |  13 ++
>>>   5 files changed, 273 insertions(+), 30 deletions(-)
>>>
>>
>> -- 
>> Thomas Zimmermann
>> Graphics Driver Developer
>> SUSE Software Solutions Germany GmbH
>> Maxfeldstr. 5, 90409 Nürnberg, Germany
>> (HRB 36809, AG Nürnberg)
>> Geschäftsführer: Felix Imendörffer
>>
> 
>