[0/2] iommu: Allow passing custom allocators to pgtable drivers

Message ID 20230809121744.2341454-1-boris.brezillon@collabora.com (mailing list archive)

Message

Boris Brezillon Aug. 9, 2023, 12:17 p.m. UTC
Hello,

This patchset is an attempt at making page table allocation
customizable. This is useful to some GPU drivers for various reasons:

- speed up upcoming page table allocations by managing a pool of free
  pages
- batch page table allocation instead of allocating one page at a time
- pre-reserve pages for page tables needed for map/unmap operations and
  return the unused page tables to some pool

The first and last reasons are particularly important for GPU drivers
wanting to implement asynchronous VM_BIND. Asynchronous VM_BIND requires
that any page table needed for a map/unmap operation to succeed be
allocated at VM_BIND job creation time. At job creation time, we don't
know what the VM will look like when we get to execute the map/unmap,
and we can't guess how many page tables we will need. Because of that,
we have to over-provision page tables for the worst-case scenario (an
empty page table tree), which means we will allocate and free a lot of
pages. Having a pool of free pages is crucial if we want to speed up
VM_BIND requests.
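
To give a rough idea of the shape of the API (simplified here, the hook
names/signatures below might not exactly match the patches), a driver
provides a pair of optional alloc/free callbacks through io_pgtable_cfg
and points them at its own pool, something like this (my_vm,
my_vm_pool_pop() and my_vm_pool_push() are hypothetical driver-side
helpers):

  /* Hypothetical driver code, for illustration only. */
  static void *my_vm_alloc_pt(void *cookie, size_t size, gfp_t gfp)
  {
          struct my_vm *vm = cookie;

          /* Pop a pre-reserved page instead of hitting the page
           * allocator in the VM_BIND run path.
           */
          return my_vm_pool_pop(vm, size);
  }

  static void my_vm_free_pt(void *cookie, void *pages, size_t size)
  {
          struct my_vm *vm = cookie;

          /* Return the page table to the pool for later reuse. */
          my_vm_pool_push(vm, pages, size);
  }

  ...
          vm->pgtbl_cfg.alloc = my_vm_alloc_pt;
          vm->pgtbl_cfg.free = my_vm_free_pt;
          vm->pgtbl_ops = alloc_io_pgtable_ops(ARM_64_LPAE_S1,
                                               &vm->pgtbl_cfg, vm);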

A real example of how such custom allocators can be used is available
here[1]. v2 of the Panthor driver is approaching submission, and I
figured I'd try to upstream the dependencies separately, which is
why I'm submitting this series now, even though the user of this new
API will come afterwards. If you'd prefer to have those patches
submitted along with the Panthor driver, let me know.

This approach has been discussed with Robin, and is hopefully not too
far from what he had in mind.

Regards,

Boris

[1]https://gitlab.freedesktop.org/panfrost/linux/-/blob/panthor/drivers/gpu/drm/panthor/panthor_mmu.c#L441

Boris Brezillon (2):
  iommu: Allow passing custom allocators to pgtable drivers
  iommu: Extend LPAE page table format to support custom allocators

 drivers/iommu/io-pgtable-arm.c | 50 +++++++++++++++++++++++-----------
 drivers/iommu/io-pgtable.c     | 31 +++++++++++++++++++++
 include/linux/io-pgtable.h     | 21 ++++++++++++++
 3 files changed, 86 insertions(+), 16 deletions(-)

Comments

Steven Price Sept. 20, 2023, 1:12 p.m. UTC | #1
On 09/08/2023 13:17, Boris Brezillon wrote:
> Hello,
> 
> This patchset is an attempt at making page table allocation
> customizable. This is useful to some GPU drivers for various reasons:
> 
> - speed-up upcoming page table allocations by managing a pool of free
>   pages
> - batch page table allocation instead of allocating one page at a time
> - pre-reserve pages for page tables needed for map/unmap operations and
>   return the unused page tables to some pool
> 
> The first and last reasons are particularly important for GPU drivers
> wanting to implement asynchronous VM_BIND. Asynchronous VM_BIND requires
> that any page table needed for a map/unmap operation to succeed be
> allocated at VM_BIND job creation time. At the time of the job creation,
> we don't know what the VM will look like when we get to execute the
> map/unmap, and can't guess how many page tables we will need. Because
> of that, we have to over-provision page tables for the worst case
> scenario (page table tree is empty), which means we will allocate/free
> a lot. Having pool a pool of free pages is crucial if we want to
> speed-up VM_BIND requests.
> 
> A real example of how such custom allocators can be used is available
> here[1]. v2 of the Panthor driver is approaching submission, and I
> figured I'd try to upstream the dependencies separately, which is
> why I submit this series now, even though the user of this new API
> will come afterwards. If you'd prefer to have those patches submitted
> along with the Panthor driver, let me know.
> 
> This approach has been discussed with Robin, and is hopefully not too
> far from what he had in mind.

The alternative would be to embed a cache of pages into the IOMMU
framework; however, kmem_cache sadly doesn't seem to support the
'reserve of pages' concept that we need. mempools could be a solution,
but the mempool would need to be created by the IOMMU framework, as the
alloc/free functions are specified when creating the pool. So it would
be a much bigger change (to drivers/iommu).
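
(For reference, mempool's alloc/free hooks are fixed when the pool is
created, roughly:

  /* The creator supplies the alloc/free hooks up front, so driver-
   * specific allocation behaviour would mean creating and owning the
   * pool inside drivers/iommu rather than in the GPU driver.
   */
  mempool_t *pool = mempool_create(min_nr, mempool_alloc_pages,
                                   mempool_free_pages,
                                   (void *)(long)order /* page order */);

with min_nr/order left as placeholders here.)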

So, given that so far it's just Panthor, this seems like the right
approach for now - when/if other drivers want the same functionality,
it might make sense to revisit the idea of doing the caching within
the IOMMU framework.

Robin: Does this approach sound sensible?

FWIW:

Reviewed-by: Steven Price <steven.price@arm.com>

Steve

> Regards,
> 
> Boris
> 
> [1]https://gitlab.freedesktop.org/panfrost/linux/-/blob/panthor/drivers/gpu/drm/panthor/panthor_mmu.c#L441
> 
> Boris Brezillon (2):
>   iommu: Allow passing custom allocators to pgtable drivers
>   iommu: Extend LPAE page table format to support custom allocators
> 
>  drivers/iommu/io-pgtable-arm.c | 50 +++++++++++++++++++++++-----------
>  drivers/iommu/io-pgtable.c     | 31 +++++++++++++++++++++
>  include/linux/io-pgtable.h     | 21 ++++++++++++++
>  3 files changed, 86 insertions(+), 16 deletions(-)
>
Rob Clark Oct. 23, 2023, 9:02 p.m. UTC | #2
On Wed, Sep 20, 2023 at 6:12 AM Steven Price <steven.price@arm.com> wrote:
>
> On 09/08/2023 13:17, Boris Brezillon wrote:
> > Hello,
> >
> > This patchset is an attempt at making page table allocation
> > customizable. This is useful to some GPU drivers for various reasons:
> >
> > - speed-up upcoming page table allocations by managing a pool of free
> >   pages
> > - batch page table allocation instead of allocating one page at a time
> > - pre-reserve pages for page tables needed for map/unmap operations and
> >   return the unused page tables to some pool
> >
> > The first and last reasons are particularly important for GPU drivers
> > wanting to implement asynchronous VM_BIND. Asynchronous VM_BIND requires
> > that any page table needed for a map/unmap operation to succeed be
> > allocated at VM_BIND job creation time. At the time of the job creation,
> > we don't know what the VM will look like when we get to execute the
> > map/unmap, and can't guess how many page tables we will need. Because
> > of that, we have to over-provision page tables for the worst case
> > scenario (page table tree is empty), which means we will allocate/free
> > a lot. Having pool a pool of free pages is crucial if we want to
> > speed-up VM_BIND requests.
> >
> > A real example of how such custom allocators can be used is available
> > here[1]. v2 of the Panthor driver is approaching submission, and I
> > figured I'd try to upstream the dependencies separately, which is
> > why I submit this series now, even though the user of this new API
> > will come afterwards. If you'd prefer to have those patches submitted
> > along with the Panthor driver, let me know.
> >
> > This approach has been discussed with Robin, and is hopefully not too
> > far from what he had in mind.
>
> The alternative would be to embed a cache of pages into the IOMMU
> framework, however kmem_cache sadly doesn't seem to support the
> 'reserve' of pages concept that we need. mempools could be a solution
> but the mempool would need to be created by the IOMMU framework as the
> alloc/free functions are specified when creating the pool. So it would
> be a much bigger change (to drivers/iommu).
>
> So, given that so far it's just Panthor this seems like the right
> approach for now - when/if other drivers want the same functionality
> then it might make sense to revisit the idea of doing the caching within
> the IOMMU framework.

I have some plans to use this as well for drm/msm, but the reasons
and requirements are basically the same as for panthor.  I think I
prefer the custom allocator approach, rather than tying this to the
IOMMU framework.  (But of course custom allocators, I guess, don't
prevent the iommu driver from doing its own caching.)

BR,
-R

> Robin: Does this approach sound sensible?
>
> FWIW:
>
> Reviewed-by: Steven Price <steven.price@arm.com>
>
> Steve
>
> > Regards,
> >
> > Boris
> >
> > [1]https://gitlab.freedesktop.org/panfrost/linux/-/blob/panthor/drivers/gpu/drm/panthor/panthor_mmu.c#L441
> >
> > Boris Brezillon (2):
> >   iommu: Allow passing custom allocators to pgtable drivers
> >   iommu: Extend LPAE page table format to support custom allocators
> >
> >  drivers/iommu/io-pgtable-arm.c | 50 +++++++++++++++++++++++-----------
> >  drivers/iommu/io-pgtable.c     | 31 +++++++++++++++++++++
> >  include/linux/io-pgtable.h     | 21 ++++++++++++++
> >  3 files changed, 86 insertions(+), 16 deletions(-)
> >
>
Gaurav Kohli Nov. 7, 2023, 11:52 a.m. UTC | #3
On 10/24/2023 2:32 AM, Rob Clark wrote:
> On Wed, Sep 20, 2023 at 6:12 AM Steven Price <steven.price@arm.com> wrote:
>>
>> On 09/08/2023 13:17, Boris Brezillon wrote:
>>> Hello,
>>>
>>> This patchset is an attempt at making page table allocation
>>> customizable. This is useful to some GPU drivers for various reasons:
>>>
>>> - speed-up upcoming page table allocations by managing a pool of free
>>>    pages
>>> - batch page table allocation instead of allocating one page at a time
>>> - pre-reserve pages for page tables needed for map/unmap operations and
>>>    return the unused page tables to some pool
>>>
>>> The first and last reasons are particularly important for GPU drivers
>>> wanting to implement asynchronous VM_BIND. Asynchronous VM_BIND requires
>>> that any page table needed for a map/unmap operation to succeed be
>>> allocated at VM_BIND job creation time. At the time of the job creation,
>>> we don't know what the VM will look like when we get to execute the
>>> map/unmap, and can't guess how many page tables we will need. Because
>>> of that, we have to over-provision page tables for the worst case
>>> scenario (page table tree is empty), which means we will allocate/free
>>> a lot. Having pool a pool of free pages is crucial if we want to
>>> speed-up VM_BIND requests.
>>>
>>> A real example of how such custom allocators can be used is available
>>> here[1]. v2 of the Panthor driver is approaching submission, and I
>>> figured I'd try to upstream the dependencies separately, which is
>>> why I submit this series now, even though the user of this new API
>>> will come afterwards. If you'd prefer to have those patches submitted
>>> along with the Panthor driver, let me know.
>>>
>>> This approach has been discussed with Robin, and is hopefully not too
>>> far from what he had in mind.
>>
>> The alternative would be to embed a cache of pages into the IOMMU
>> framework, however kmem_cache sadly doesn't seem to support the
>> 'reserve' of pages concept that we need. mempools could be a solution
>> but the mempool would need to be created by the IOMMU framework as the
>> alloc/free functions are specified when creating the pool. So it would
>> be a much bigger change (to drivers/iommu).
>>
>> So, given that so far it's just Panthor this seems like the right
>> approach for now - when/if other drivers want the same functionality
>> then it might make sense to revisit the idea of doing the caching within
>> the IOMMU framework.
> 
> I have some plans to use this as well for drm/msm.. but the reasons
> and requirements are basically the same as for panthor.  I think I
> prefer the custom allocator approach, rather than tying this to IOMMU
> framework.  (But ofc custom allocators, I guess, does not prevent the
> iommu driver from doing it's own caching.)
> 
> BR,
> -R
> 

We have also posted an RFC patch series which is based on these
patches from Boris and which helps us define our custom alloc and free
pgtable callbacks. For our use case, we have a requirement to create
the pgtable from HLOS and then share it with a different entity (VMID),
and that also requires a few SMC calls, which is why we need custom
alloc/free callbacks.

https://lore.kernel.org/all/20231101071144.16309-1-quic_gkohli@quicinc.com/
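
A rough sketch of what the alloc side looks like for us (hypothetical
names here, the real code is in the RFC above):

  static void *qcom_pgtable_alloc(void *cookie, size_t size, gfp_t gfp)
  {
          void *page = (void *)__get_free_pages(gfp, get_order(size));

          if (!page)
                  return NULL;

          /* Hypothetical helper wrapping the SMC call that assigns
           * the page to the other entity's VMID.
           */
          if (my_assign_page_to_vmid(virt_to_phys(page), size, TARGET_VMID)) {
                  free_pages((unsigned long)page, get_order(size));
                  return NULL;
          }

          return page;
  }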


So the custom alloc and free ops are helping us as well. Is there any
plan to merge these patches from Boris?




>> Robin: Does this approach sound sensible?
>>
>> FWIW:
>>
>> Reviewed-by: Steven Price <steven.price@arm.com>
>>
>> Steve
>>
>>> Regards,
>>>
>>> Boris
>>>
>>> [1]https://gitlab.freedesktop.org/panfrost/linux/-/blob/panthor/drivers/gpu/drm/panthor/panthor_mmu.c#L441
>>>
>>> Boris Brezillon (2):
>>>    iommu: Allow passing custom allocators to pgtable drivers
>>>    iommu: Extend LPAE page table format to support custom allocators
>>>
>>>   drivers/iommu/io-pgtable-arm.c | 50 +++++++++++++++++++++++-----------
>>>   drivers/iommu/io-pgtable.c     | 31 +++++++++++++++++++++
>>>   include/linux/io-pgtable.h     | 21 ++++++++++++++
>>>   3 files changed, 86 insertions(+), 16 deletions(-)
>>>
>>
> 
Gaurav Kohli Nov. 7, 2023, 12:01 p.m. UTC | #4
On 11/7/2023 5:22 PM, Gaurav Kohli wrote:
> 
> 
> On 10/24/2023 2:32 AM, Rob Clark wrote:
>> On Wed, Sep 20, 2023 at 6:12 AM Steven Price <steven.price@arm.com> 
>> wrote:
>>>
>>> On 09/08/2023 13:17, Boris Brezillon wrote:
>>>> Hello,
>>>>
>>>> This patchset is an attempt at making page table allocation
>>>> customizable. This is useful to some GPU drivers for various reasons:
>>>>
>>>> - speed-up upcoming page table allocations by managing a pool of free
>>>>    pages
>>>> - batch page table allocation instead of allocating one page at a time
>>>> - pre-reserve pages for page tables needed for map/unmap operations and
>>>>    return the unused page tables to some pool
>>>>
>>>> The first and last reasons are particularly important for GPU drivers
>>>> wanting to implement asynchronous VM_BIND. Asynchronous VM_BIND 
>>>> requires
>>>> that any page table needed for a map/unmap operation to succeed be
>>>> allocated at VM_BIND job creation time. At the time of the job 
>>>> creation,
>>>> we don't know what the VM will look like when we get to execute the
>>>> map/unmap, and can't guess how many page tables we will need. Because
>>>> of that, we have to over-provision page tables for the worst case
>>>> scenario (page table tree is empty), which means we will allocate/free
>>>> a lot. Having pool a pool of free pages is crucial if we want to
>>>> speed-up VM_BIND requests.
>>>>
>>>> A real example of how such custom allocators can be used is available
>>>> here[1]. v2 of the Panthor driver is approaching submission, and I
>>>> figured I'd try to upstream the dependencies separately, which is
>>>> why I submit this series now, even though the user of this new API
>>>> will come afterwards. If you'd prefer to have those patches submitted
>>>> along with the Panthor driver, let me know.
>>>>
>>>> This approach has been discussed with Robin, and is hopefully not too
>>>> far from what he had in mind.
>>>
>>> The alternative would be to embed a cache of pages into the IOMMU
>>> framework, however kmem_cache sadly doesn't seem to support the
>>> 'reserve' of pages concept that we need. mempools could be a solution
>>> but the mempool would need to be created by the IOMMU framework as the
>>> alloc/free functions are specified when creating the pool. So it would
>>> be a much bigger change (to drivers/iommu).
>>>
>>> So, given that so far it's just Panthor this seems like the right
>>> approach for now - when/if other drivers want the same functionality
>>> then it might make sense to revisit the idea of doing the caching within
>>> the IOMMU framework.
>>
>> I have some plans to use this as well for drm/msm.. but the reasons
>> and requirements are basically the same as for panthor.  I think I
>> prefer the custom allocator approach, rather than tying this to IOMMU
>> framework.  (But ofc custom allocators, I guess, does not prevent the
>> iommu driver from doing it's own caching.)
>>
>> BR,
>> -R
>>
> 
> We have also posted one RFC patch series which is based on this current 
> patches by Boris and helping us to define our custom alloc and free 
> pgtable call. For our side usecase we have a requirement to create 
> pgtable from HLOS and then share it to different entity(VMID) and 
> basically that also requires few smc calls and for that we need
> custom alloc/free callbacks.
> 
> https://lore.kernel.org/all/20231101071144.16309-1-quic_gkohli@quicinc.com/
> 
> 
> So custom allocator and free ops is helping for us also. Is there any 
> plan to merge these patches from Boris.
> 
> 
> 
> 

Please use the Tested-by tag below if you are planning to merge the
patches from Boris:

Tested-by: Gaurav Kohli <quic_gkohli@quicinc.com>

>>> Robin: Does this approach sound sensible?
>>>
>>> FWIW:
>>>
>>> Reviewed-by: Steven Price <steven.price@arm.com>
>>>
>>> Steve
>>>
>>>> Regards,
>>>>
>>>> Boris
>>>>
>>>> [1]https://gitlab.freedesktop.org/panfrost/linux/-/blob/panthor/drivers/gpu/drm/panthor/panthor_mmu.c#L441
>>>>
>>>> Boris Brezillon (2):
>>>>    iommu: Allow passing custom allocators to pgtable drivers
>>>>    iommu: Extend LPAE page table format to support custom allocators
>>>>
>>>>   drivers/iommu/io-pgtable-arm.c | 50 
>>>> +++++++++++++++++++++++-----------
>>>>   drivers/iommu/io-pgtable.c     | 31 +++++++++++++++++++++
>>>>   include/linux/io-pgtable.h     | 21 ++++++++++++++
>>>>   3 files changed, 86 insertions(+), 16 deletions(-)
>>>>
>>>
>>
>
Boris Brezillon Nov. 10, 2023, 9:47 a.m. UTC | #5
Hi Gaurav,

On Tue, 7 Nov 2023 17:22:39 +0530
Gaurav Kohli <quic_gkohli@quicinc.com> wrote:

> On 10/24/2023 2:32 AM, Rob Clark wrote:
> > On Wed, Sep 20, 2023 at 6:12 AM Steven Price <steven.price@arm.com> wrote:  
> >>
> >> On 09/08/2023 13:17, Boris Brezillon wrote:  
> >>> Hello,
> >>>
> >>> This patchset is an attempt at making page table allocation
> >>> customizable. This is useful to some GPU drivers for various reasons:
> >>>
> >>> - speed-up upcoming page table allocations by managing a pool of free
> >>>    pages
> >>> - batch page table allocation instead of allocating one page at a time
> >>> - pre-reserve pages for page tables needed for map/unmap operations and
> >>>    return the unused page tables to some pool
> >>>
> >>> The first and last reasons are particularly important for GPU drivers
> >>> wanting to implement asynchronous VM_BIND. Asynchronous VM_BIND requires
> >>> that any page table needed for a map/unmap operation to succeed be
> >>> allocated at VM_BIND job creation time. At the time of the job creation,
> >>> we don't know what the VM will look like when we get to execute the
> >>> map/unmap, and can't guess how many page tables we will need. Because
> >>> of that, we have to over-provision page tables for the worst case
> >>> scenario (page table tree is empty), which means we will allocate/free
> >>> a lot. Having pool a pool of free pages is crucial if we want to
> >>> speed-up VM_BIND requests.
> >>>
> >>> A real example of how such custom allocators can be used is available
> >>> here[1]. v2 of the Panthor driver is approaching submission, and I
> >>> figured I'd try to upstream the dependencies separately, which is
> >>> why I submit this series now, even though the user of this new API
> >>> will come afterwards. If you'd prefer to have those patches submitted
> >>> along with the Panthor driver, let me know.
> >>>
> >>> This approach has been discussed with Robin, and is hopefully not too
> >>> far from what he had in mind.  
> >>
> >> The alternative would be to embed a cache of pages into the IOMMU
> >> framework, however kmem_cache sadly doesn't seem to support the
> >> 'reserve' of pages concept that we need. mempools could be a solution
> >> but the mempool would need to be created by the IOMMU framework as the
> >> alloc/free functions are specified when creating the pool. So it would
> >> be a much bigger change (to drivers/iommu).
> >>
> >> So, given that so far it's just Panthor this seems like the right
> >> approach for now - when/if other drivers want the same functionality
> >> then it might make sense to revisit the idea of doing the caching within
> >> the IOMMU framework.  
> > 
> > I have some plans to use this as well for drm/msm.. but the reasons
> > and requirements are basically the same as for panthor.  I think I
> > prefer the custom allocator approach, rather than tying this to IOMMU
> > framework.  (But ofc custom allocators, I guess, does not prevent the
> > iommu driver from doing it's own caching.)
> > 
> > BR,
> > -R
> >   
> 
> We have also posted one RFC patch series which is based on this current 
> patches by Boris and helping us to define our custom alloc and free 
> pgtable call. For our side usecase we have a requirement to create 
> pgtable from HLOS and then share it to different entity(VMID) and 
> basically that also requires few smc calls and for that we need
> custom alloc/free callbacks.
> 
> https://lore.kernel.org/all/20231101071144.16309-1-quic_gkohli@quicinc.com/
> 
> 
> So custom allocator and free ops is helping for us also. Is there any 
> plan to merge these patches from Boris.

Sorry for the late reply. I just sent a v2, but I forgot to add your
Tested-by :-/. Feel free to add it back.

Regards,

Boris