Message ID | 20240428124426.309096-8-mcanal@igalia.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | drm/v3d: Enable Big and Super Pages | expand |
Hi Maíra, a question below: El dom, 28-04-2024 a las 09:40 -0300, Maíra Canal escribió: > Although Big/Super Pages could appear naturally, it would be quite > hard > to have 1MB or 64KB allocated contiguously naturally. Therefore, we > can > force the creation of large pages allocated contiguously by using a > mountpoint with "huge=within_size" enabled. > > Therefore, as V3D has a mountpoint with "huge=within_size" (if user > has > THP enabled), use this mountpoint for BO creation if available. This > will allow us to create large pages allocated contiguously and make > use > of Big/Super Pages. > > Signed-off-by: Maíra Canal <mcanal@igalia.com> > Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> > --- > (...) > @@ -130,10 +140,17 @@ struct v3d_bo *v3d_bo_create(struct drm_device > *dev, struct drm_file *file_priv, > size_t unaligned_size) > { > struct drm_gem_shmem_object *shmem_obj; > + struct v3d_dev *v3d = to_v3d_dev(dev); > struct v3d_bo *bo; > int ret; > > - shmem_obj = drm_gem_shmem_create(dev, unaligned_size); > + /* Let the user opt out of allocating the BOs with THP */ > + if (v3d->gemfs) > + shmem_obj = drm_gem_shmem_create_with_mnt(dev, > unaligned_size, > + v3d- > >gemfs); > + else > + shmem_obj = drm_gem_shmem_create(dev, > unaligned_size); > + > if (IS_ERR(shmem_obj)) > return ERR_CAST(shmem_obj); > bo = to_v3d_bo(&shmem_obj->base); if I read this correctly when we have THP we always allocate with that, Even objects that are smaller than 64KB. I was wondering if there is any benefit/downside to this or if the behavior for small allocations is the same we had without the new mount point. Iago
Hi Iago, On 4/29/24 02:22, Iago Toral wrote: > Hi Maíra, > > a question below: > > El dom, 28-04-2024 a las 09:40 -0300, Maíra Canal escribió: >> Although Big/Super Pages could appear naturally, it would be quite >> hard >> to have 1MB or 64KB allocated contiguously naturally. Therefore, we >> can >> force the creation of large pages allocated contiguously by using a >> mountpoint with "huge=within_size" enabled. >> >> Therefore, as V3D has a mountpoint with "huge=within_size" (if user >> has >> THP enabled), use this mountpoint for BO creation if available. This >> will allow us to create large pages allocated contiguously and make >> use >> of Big/Super Pages. >> >> Signed-off-by: Maíra Canal <mcanal@igalia.com> >> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> >> --- >> > > (...) > >> @@ -130,10 +140,17 @@ struct v3d_bo *v3d_bo_create(struct drm_device >> *dev, struct drm_file *file_priv, >> size_t unaligned_size) >> { >> struct drm_gem_shmem_object *shmem_obj; >> + struct v3d_dev *v3d = to_v3d_dev(dev); >> struct v3d_bo *bo; >> int ret; >> >> - shmem_obj = drm_gem_shmem_create(dev, unaligned_size); >> + /* Let the user opt out of allocating the BOs with THP */ >> + if (v3d->gemfs) >> + shmem_obj = drm_gem_shmem_create_with_mnt(dev, >> unaligned_size, >> + v3d- >>> gemfs); >> + else >> + shmem_obj = drm_gem_shmem_create(dev, >> unaligned_size); >> + >> if (IS_ERR(shmem_obj)) >> return ERR_CAST(shmem_obj); >> bo = to_v3d_bo(&shmem_obj->base); > > > if I read this correctly when we have THP we always allocate with that, > Even objects that are smaller than 64KB. I was wondering if there is > any benefit/downside to this or if the behavior for small allocations > is the same we had without the new mount point. I'm assuming that your concern is related to memory pressure and memory fragmentation. As we are using `huge=within_size`, we only allocate a huge page if it will be fully within `i_size`. When using `huge=within_size`, we can optimize the performance for smaller files without forcing larger files to also use huge pages. I don't understand `huge=within_size` in full details, but it is possible to check that it is able to avoid the system (even the RPi) to go OOM. Although it is slightly less performant than `huge=always` (used in v1), I believe it is more ideal for a system such as the RPi due to the memory requirements. Best Regards, - Maíra > > Iago
diff --git a/drivers/gpu/drm/v3d/v3d_bo.c b/drivers/gpu/drm/v3d/v3d_bo.c index 79e31c5299b1..16ac26c31c6b 100644 --- a/drivers/gpu/drm/v3d/v3d_bo.c +++ b/drivers/gpu/drm/v3d/v3d_bo.c @@ -94,6 +94,7 @@ v3d_bo_create_finish(struct drm_gem_object *obj) struct v3d_dev *v3d = to_v3d_dev(obj->dev); struct v3d_bo *bo = to_v3d_bo(obj); struct sg_table *sgt; + u64 align; int ret; /* So far we pin the BO in the MMU for its lifetime, so use @@ -103,6 +104,15 @@ v3d_bo_create_finish(struct drm_gem_object *obj) if (IS_ERR(sgt)) return PTR_ERR(sgt); + if (!v3d->gemfs) + align = SZ_4K; + else if (obj->size >= SZ_1M) + align = SZ_1M; + else if (obj->size >= SZ_64K) + align = SZ_64K; + else + align = SZ_4K; + spin_lock(&v3d->mm_lock); /* Allocate the object's space in the GPU's page tables. * Inserting PTEs will happen later, but the offset is for the @@ -110,7 +120,7 @@ v3d_bo_create_finish(struct drm_gem_object *obj) */ ret = drm_mm_insert_node_generic(&v3d->mm, &bo->node, obj->size >> V3D_MMU_PAGE_SHIFT, - SZ_4K >> V3D_MMU_PAGE_SHIFT, 0, 0); + align >> V3D_MMU_PAGE_SHIFT, 0, 0); spin_unlock(&v3d->mm_lock); if (ret) return ret; @@ -130,10 +140,17 @@ struct v3d_bo *v3d_bo_create(struct drm_device *dev, struct drm_file *file_priv, size_t unaligned_size) { struct drm_gem_shmem_object *shmem_obj; + struct v3d_dev *v3d = to_v3d_dev(dev); struct v3d_bo *bo; int ret; - shmem_obj = drm_gem_shmem_create(dev, unaligned_size); + /* Let the user opt out of allocating the BOs with THP */ + if (v3d->gemfs) + shmem_obj = drm_gem_shmem_create_with_mnt(dev, unaligned_size, + v3d->gemfs); + else + shmem_obj = drm_gem_shmem_create(dev, unaligned_size); + if (IS_ERR(shmem_obj)) return ERR_CAST(shmem_obj); bo = to_v3d_bo(&shmem_obj->base);