diff mbox series

[v4,7/8] drm/v3d: Use gemfs/THP in BO creation if available

Message ID 20240428124426.309096-8-mcanal@igalia.com (mailing list archive)
State New, archived
Headers show
Series drm/v3d: Enable Big and Super Pages | expand

Commit Message

Maíra Canal April 28, 2024, 12:40 p.m. UTC
Although Big/Super Pages could appear naturally, it would be quite hard
to have 1MB or 64KB allocated contiguously naturally. Therefore, we can
force the creation of large pages allocated contiguously by using a
mountpoint with "huge=within_size" enabled.

Therefore, as V3D has a mountpoint with "huge=within_size" (if user has
THP enabled), use this mountpoint for BO creation if available. This
will allow us to create large pages allocated contiguously and make use
of Big/Super Pages.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
---
 drivers/gpu/drm/v3d/v3d_bo.c | 21 +++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)

Comments

Iago Toral April 29, 2024, 5:22 a.m. UTC | #1
Hi Maíra, 

a question below:

El dom, 28-04-2024 a las 09:40 -0300, Maíra Canal escribió:
> Although Big/Super Pages could appear naturally, it would be quite
> hard
> to have 1MB or 64KB allocated contiguously naturally. Therefore, we
> can
> force the creation of large pages allocated contiguously by using a
> mountpoint with "huge=within_size" enabled.
> 
> Therefore, as V3D has a mountpoint with "huge=within_size" (if user
> has
> THP enabled), use this mountpoint for BO creation if available. This
> will allow us to create large pages allocated contiguously and make
> use
> of Big/Super Pages.
> 
> Signed-off-by: Maíra Canal <mcanal@igalia.com>
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
> ---
> 

(...)

> @@ -130,10 +140,17 @@ struct v3d_bo *v3d_bo_create(struct drm_device
> *dev, struct drm_file *file_priv,
>  			     size_t unaligned_size)
>  {
>  	struct drm_gem_shmem_object *shmem_obj;
> +	struct v3d_dev *v3d = to_v3d_dev(dev);
>  	struct v3d_bo *bo;
>  	int ret;
>  
> -	shmem_obj = drm_gem_shmem_create(dev, unaligned_size);
> +	/* Let the user opt out of allocating the BOs with THP */
> +	if (v3d->gemfs)
> +		shmem_obj = drm_gem_shmem_create_with_mnt(dev,
> unaligned_size,
> +							  v3d-
> >gemfs);
> +	else
> +		shmem_obj = drm_gem_shmem_create(dev,
> unaligned_size);
> +
>  	if (IS_ERR(shmem_obj))
>  		return ERR_CAST(shmem_obj);
>  	bo = to_v3d_bo(&shmem_obj->base);


if I read this correctly when we have THP we always allocate with that,
Even objects that are smaller than 64KB. I was wondering if there is
any benefit/downside to this or if the behavior for small allocations
is the same we had without the new mount point.

Iago
Maíra Canal April 29, 2024, 4:52 p.m. UTC | #2
Hi Iago,

On 4/29/24 02:22, Iago Toral wrote:
> Hi Maíra,
> 
> a question below:
> 
> El dom, 28-04-2024 a las 09:40 -0300, Maíra Canal escribió:
>> Although Big/Super Pages could appear naturally, it would be quite
>> hard
>> to have 1MB or 64KB allocated contiguously naturally. Therefore, we
>> can
>> force the creation of large pages allocated contiguously by using a
>> mountpoint with "huge=within_size" enabled.
>>
>> Therefore, as V3D has a mountpoint with "huge=within_size" (if user
>> has
>> THP enabled), use this mountpoint for BO creation if available. This
>> will allow us to create large pages allocated contiguously and make
>> use
>> of Big/Super Pages.
>>
>> Signed-off-by: Maíra Canal <mcanal@igalia.com>
>> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>> ---
>>
> 
> (...)
> 
>> @@ -130,10 +140,17 @@ struct v3d_bo *v3d_bo_create(struct drm_device
>> *dev, struct drm_file *file_priv,
>>   			     size_t unaligned_size)
>>   {
>>   	struct drm_gem_shmem_object *shmem_obj;
>> +	struct v3d_dev *v3d = to_v3d_dev(dev);
>>   	struct v3d_bo *bo;
>>   	int ret;
>>   
>> -	shmem_obj = drm_gem_shmem_create(dev, unaligned_size);
>> +	/* Let the user opt out of allocating the BOs with THP */
>> +	if (v3d->gemfs)
>> +		shmem_obj = drm_gem_shmem_create_with_mnt(dev,
>> unaligned_size,
>> +							  v3d-
>>> gemfs);
>> +	else
>> +		shmem_obj = drm_gem_shmem_create(dev,
>> unaligned_size);
>> +
>>   	if (IS_ERR(shmem_obj))
>>   		return ERR_CAST(shmem_obj);
>>   	bo = to_v3d_bo(&shmem_obj->base);
> 
> 
> if I read this correctly when we have THP we always allocate with that,
> Even objects that are smaller than 64KB. I was wondering if there is
> any benefit/downside to this or if the behavior for small allocations
> is the same we had without the new mount point.

I'm assuming that your concern is related to memory pressure and memory
fragmentation.

As we are using `huge=within_size`, we only allocate a huge page if it
will be fully within `i_size`. When using `huge=within_size`, we can
optimize the performance for smaller files without forcing larger files
to also use huge pages. I don't understand `huge=within_size` in full
details, but it is possible to check that it is able to avoid the system
(even the RPi) to go OOM. Although it is slightly less performant than
`huge=always` (used in v1), I believe it is more ideal for a system such
as the RPi due to the memory requirements.

Best Regards,
- Maíra

> 
> Iago
diff mbox series

Patch

diff --git a/drivers/gpu/drm/v3d/v3d_bo.c b/drivers/gpu/drm/v3d/v3d_bo.c
index 79e31c5299b1..16ac26c31c6b 100644
--- a/drivers/gpu/drm/v3d/v3d_bo.c
+++ b/drivers/gpu/drm/v3d/v3d_bo.c
@@ -94,6 +94,7 @@  v3d_bo_create_finish(struct drm_gem_object *obj)
 	struct v3d_dev *v3d = to_v3d_dev(obj->dev);
 	struct v3d_bo *bo = to_v3d_bo(obj);
 	struct sg_table *sgt;
+	u64 align;
 	int ret;
 
 	/* So far we pin the BO in the MMU for its lifetime, so use
@@ -103,6 +104,15 @@  v3d_bo_create_finish(struct drm_gem_object *obj)
 	if (IS_ERR(sgt))
 		return PTR_ERR(sgt);
 
+	if (!v3d->gemfs)
+		align = SZ_4K;
+	else if (obj->size >= SZ_1M)
+		align = SZ_1M;
+	else if (obj->size >= SZ_64K)
+		align = SZ_64K;
+	else
+		align = SZ_4K;
+
 	spin_lock(&v3d->mm_lock);
 	/* Allocate the object's space in the GPU's page tables.
 	 * Inserting PTEs will happen later, but the offset is for the
@@ -110,7 +120,7 @@  v3d_bo_create_finish(struct drm_gem_object *obj)
 	 */
 	ret = drm_mm_insert_node_generic(&v3d->mm, &bo->node,
 					 obj->size >> V3D_MMU_PAGE_SHIFT,
-					 SZ_4K >> V3D_MMU_PAGE_SHIFT, 0, 0);
+					 align >> V3D_MMU_PAGE_SHIFT, 0, 0);
 	spin_unlock(&v3d->mm_lock);
 	if (ret)
 		return ret;
@@ -130,10 +140,17 @@  struct v3d_bo *v3d_bo_create(struct drm_device *dev, struct drm_file *file_priv,
 			     size_t unaligned_size)
 {
 	struct drm_gem_shmem_object *shmem_obj;
+	struct v3d_dev *v3d = to_v3d_dev(dev);
 	struct v3d_bo *bo;
 	int ret;
 
-	shmem_obj = drm_gem_shmem_create(dev, unaligned_size);
+	/* Let the user opt out of allocating the BOs with THP */
+	if (v3d->gemfs)
+		shmem_obj = drm_gem_shmem_create_with_mnt(dev, unaligned_size,
+							  v3d->gemfs);
+	else
+		shmem_obj = drm_gem_shmem_create(dev, unaligned_size);
+
 	if (IS_ERR(shmem_obj))
 		return ERR_CAST(shmem_obj);
 	bo = to_v3d_bo(&shmem_obj->base);