Message ID | 20241004194207.1013744-2-sui.jingfeng@linux.dev (mailing list archive) |
---|---|
State | New, archived |
Series | Fix GPU virtual address collosion when CPU page size != GPU page size |
On Saturday, 05.10.2024 at 03:42 +0800, Sui Jingfeng wrote:
> Etnaviv assumes that the GPU page size is 4KiB, yet on some systems the CPU
> page size is 16KiB. The size of etnaviv buffer objects is aligned to the
> CPU page size on the kernel side; however, userspace still assumes the page
> size is 4KiB and allocates in units of 4KiB pages. This results in
> collisions between the GPU virtual address ranges allocated by userspace,
> so a BO cannot be inserted exactly into the specified hole.
>
> The root cause is that a kernel-side BO takes up more address space than
> userspace assumes when its size is not aligned to the CPU page size. To
> keep the GPU VA space as contiguous as possible, track the size that
> userspace/GPU thinks the BO has.
>
> Yes, we still need to overallocate to suit the CPU, but there is no need
> to waste GPU VA space anymore.
>
> Signed-off-by: Sui Jingfeng <sui.jingfeng@linux.dev>
> ---
>  drivers/gpu/drm/etnaviv/etnaviv_gem.c | 8 +++++---
>  drivers/gpu/drm/etnaviv/etnaviv_gem.h | 1 +
>  drivers/gpu/drm/etnaviv/etnaviv_mmu.c | 8 ++++----
>  3 files changed, 10 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem.c b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
> index 5c0c9d4e3be1..943fc20093e6 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_gem.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
> @@ -543,7 +543,7 @@ static const struct drm_gem_object_funcs etnaviv_gem_object_funcs = {
>  	.vm_ops = &vm_ops,
>  };
>
> -static int etnaviv_gem_new_impl(struct drm_device *dev, u32 flags,
> +static int etnaviv_gem_new_impl(struct drm_device *dev, u32 size, u32 flags,
>  	const struct etnaviv_gem_ops *ops, struct drm_gem_object **obj)
>  {
>  	struct etnaviv_gem_object *etnaviv_obj;
> @@ -570,6 +570,7 @@ static int etnaviv_gem_new_impl(struct drm_device *dev, u32 flags,
>  	if (!etnaviv_obj)
>  		return -ENOMEM;
>
> +	etnaviv_obj->user_size = size;
>  	etnaviv_obj->flags = flags;
>  	etnaviv_obj->ops = ops;
>
> @@ -588,11 +589,12 @@ int etnaviv_gem_new_handle(struct drm_device *dev, struct drm_file *file,
>  {
>  	struct etnaviv_drm_private *priv = dev->dev_private;
>  	struct drm_gem_object *obj = NULL;
> +	unsigned int user_size = size;

This still needs to be aligned to 4K. Userspace may request
unaligned buffer sizes and we don't want to risk any confusion about
which part is visible to the GPU, so better make sure this size is
aligned to the GPU page size.

Also, this is more of a personal preference, but I would call this
gpu_size or something like that, to avoid any confusion with the
user_size in etnaviv_cmdbuf, where user_size doesn't denote the GPU
visible size.

Regards,
Lucas

>  	int ret;
>
>  	size = PAGE_ALIGN(size);
>
> -	ret = etnaviv_gem_new_impl(dev, flags, &etnaviv_gem_shmem_ops, &obj);
> +	ret = etnaviv_gem_new_impl(dev, user_size, flags, &etnaviv_gem_shmem_ops, &obj);
>  	if (ret)
>  		goto fail;
>
> @@ -627,7 +629,7 @@ int etnaviv_gem_new_private(struct drm_device *dev, size_t size, u32 flags,
>  	struct drm_gem_object *obj;
>  	int ret;
>
> -	ret = etnaviv_gem_new_impl(dev, flags, ops, &obj);
> +	ret = etnaviv_gem_new_impl(dev, size, flags, ops, &obj);
>  	if (ret)
>  		return ret;
>
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem.h b/drivers/gpu/drm/etnaviv/etnaviv_gem.h
> index a42d260cac2c..c6e27b9abb0c 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_gem.h
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem.h
> @@ -36,6 +36,7 @@ struct etnaviv_gem_object {
>  	const struct etnaviv_gem_ops *ops;
>  	struct mutex lock;
>
> +	u32 user_size;
>  	u32 flags;
>
>  	struct list_head gem_node;
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_mmu.c b/drivers/gpu/drm/etnaviv/etnaviv_mmu.c
> index 1661d589bf3e..6fbc62772d85 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_mmu.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_mmu.c
> @@ -281,6 +281,7 @@ int etnaviv_iommu_map_gem(struct etnaviv_iommu_context *context,
>  {
>  	struct sg_table *sgt = etnaviv_obj->sgt;
>  	struct drm_mm_node *node;
> +	unsigned int user_size;
>  	int ret;
>
>  	lockdep_assert_held(&etnaviv_obj->lock);
> @@ -303,13 +304,12 @@ int etnaviv_iommu_map_gem(struct etnaviv_iommu_context *context,
>  	}
>
>  	node = &mapping->vram_node;
> +	user_size = etnaviv_obj->user_size;
>
>  	if (va)
> -		ret = etnaviv_iommu_insert_exact(context, node,
> -						 etnaviv_obj->base.size, va);
> +		ret = etnaviv_iommu_insert_exact(context, node, user_size, va);
>  	else
> -		ret = etnaviv_iommu_find_iova(context, node,
> -					      etnaviv_obj->base.size);
> +		ret = etnaviv_iommu_find_iova(context, node, user_size);
>  	if (ret < 0)
>  		goto unlock;
>
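The review comment above asks for the GPU-visible size to be aligned to the GPU page size, independently of the CPU-page-aligned backing size. A minimal sketch of that idea follows, written as standalone C rather than kernel code. The names `struct bo_sizes`, `compute_bo_sizes` and `align_up` are hypothetical and do not exist in the driver; only the 4 KiB GPU page size comes from the thread.

```c
#include <assert.h>
#include <stdint.h>

#define GPU_PAGE_SIZE 4096u /* 4 KiB GPU page, as stated in the thread */

/* Hypothetical pair of sizes tracked per buffer object. */
struct bo_sizes {
	uint32_t backing_size; /* CPU-page aligned: what the kernel allocates */
	uint32_t gpu_size;     /* GPU-page aligned: what occupies GPU VA space */
};

/* Round x up to a multiple of a; a must be a power of two. */
static inline uint32_t align_up(uint32_t x, uint32_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/*
 * Derive both sizes from the size userspace requested.
 * cpu_page_size is a parameter here only so the example can be
 * exercised with different page sizes (an assumption of this sketch).
 */
static inline struct bo_sizes compute_bo_sizes(uint32_t requested,
					       uint32_t cpu_page_size)
{
	struct bo_sizes s;

	s.backing_size = align_up(requested, cpu_page_size);
	s.gpu_size = align_up(requested, GPU_PAGE_SIZE);
	return s;
}
```

With a 16 KiB CPU page, an unaligned 5000-byte request would get a 16384-byte backing allocation but only occupy 8192 bytes of GPU VA space in this sketch; with a 4 KiB CPU page both sizes coincide.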
Hi,

On 2024/10/7 18:12, Lucas Stach wrote:
> On Saturday, 05.10.2024 at 03:42 +0800, Sui Jingfeng wrote:
>> Etnaviv assumes that the GPU page size is 4KiB, yet on some systems the CPU
>> page size is 16KiB. The size of etnaviv buffer objects is aligned to the
>> CPU page size on the kernel side; however, userspace still assumes the page
>> size is 4KiB and allocates in units of 4KiB pages. This results in
>> collisions between the GPU virtual address ranges allocated by userspace,
>> so a BO cannot be inserted exactly into the specified hole.
>>
>> The root cause is that a kernel-side BO takes up more address space than
>> userspace assumes when its size is not aligned to the CPU page size. To
>> keep the GPU VA space as contiguous as possible, track the size that
>> userspace/GPU thinks the BO has.
>>
>> Yes, we still need to overallocate to suit the CPU, but there is no need
>> to waste GPU VA space anymore.
>>
>> Signed-off-by: Sui Jingfeng <sui.jingfeng@linux.dev>
>> ---
>>  drivers/gpu/drm/etnaviv/etnaviv_gem.c | 8 +++++---
>>  drivers/gpu/drm/etnaviv/etnaviv_gem.h | 1 +
>>  drivers/gpu/drm/etnaviv/etnaviv_mmu.c | 8 ++++----
>>  3 files changed, 10 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem.c b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
>> index 5c0c9d4e3be1..943fc20093e6 100644
>> --- a/drivers/gpu/drm/etnaviv/etnaviv_gem.c
>> +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
>> @@ -543,7 +543,7 @@ static const struct drm_gem_object_funcs etnaviv_gem_object_funcs = {
>>  	.vm_ops = &vm_ops,
>>  };
>>
>> -static int etnaviv_gem_new_impl(struct drm_device *dev, u32 flags,
>> +static int etnaviv_gem_new_impl(struct drm_device *dev, u32 size, u32 flags,
>>  	const struct etnaviv_gem_ops *ops, struct drm_gem_object **obj)
>>  {
>>  	struct etnaviv_gem_object *etnaviv_obj;
>> @@ -570,6 +570,7 @@ static int etnaviv_gem_new_impl(struct drm_device *dev, u32 flags,
>>  	if (!etnaviv_obj)
>>  		return -ENOMEM;
>>
>> +	etnaviv_obj->user_size = size;
>>  	etnaviv_obj->flags = flags;
>>  	etnaviv_obj->ops = ops;
>>
>> @@ -588,11 +589,12 @@ int etnaviv_gem_new_handle(struct drm_device *dev, struct drm_file *file,
>>  {
>>  	struct etnaviv_drm_private *priv = dev->dev_private;
>>  	struct drm_gem_object *obj = NULL;
>> +	unsigned int user_size = size;
> This still needs to be aligned to 4K. Userspace may request
> unaligned buffer sizes and we don't want to risk any confusion about
> which part is visible to the GPU, so better make sure this size is
> aligned to the GPU page size.

OK, aligning to the GPU page size is reasonable, since the buffer is
very likely to be used by the GPU.

> Also, this is more of a personal preference, but I would call this
> gpu_size or something like that, to avoid any confusion with the
> user_size in etnaviv_cmdbuf, where user_size doesn't denote the GPU
> visible size.

Yeah, the user_size there denotes the length of the command buffer, which
usually only needs to be aligned to 8 bytes. And generally, the size of a
command buffer won't be larger than 4KiB (one GPU page).

I'm thinking of just 'size' with an extra comment, as it's possible that
a buffer is only used by the CPU for a specific purpose.

Best Regards,
Sui

> Regards,
> Lucas
>
>>  	int ret;
>>
>>  	size = PAGE_ALIGN(size);
>>
>> -	ret = etnaviv_gem_new_impl(dev, flags, &etnaviv_gem_shmem_ops, &obj);
>> +	ret = etnaviv_gem_new_impl(dev, user_size, flags, &etnaviv_gem_shmem_ops, &obj);
>>  	if (ret)
>>  		goto fail;
>>
>> @@ -627,7 +629,7 @@ int etnaviv_gem_new_private(struct drm_device *dev, size_t size, u32 flags,
>>  	struct drm_gem_object *obj;
>>  	int ret;
>>
>> -	ret = etnaviv_gem_new_impl(dev, flags, ops, &obj);
>> +	ret = etnaviv_gem_new_impl(dev, size, flags, ops, &obj);
>>  	if (ret)
>>  		return ret;
>>
>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem.h b/drivers/gpu/drm/etnaviv/etnaviv_gem.h
>> index a42d260cac2c..c6e27b9abb0c 100644
>> --- a/drivers/gpu/drm/etnaviv/etnaviv_gem.h
>> +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem.h
>> @@ -36,6 +36,7 @@ struct etnaviv_gem_object {
>>  	const struct etnaviv_gem_ops *ops;
>>  	struct mutex lock;
>>
>> +	u32 user_size;
>>  	u32 flags;
>>
>>  	struct list_head gem_node;
>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_mmu.c b/drivers/gpu/drm/etnaviv/etnaviv_mmu.c
>> index 1661d589bf3e..6fbc62772d85 100644
>> --- a/drivers/gpu/drm/etnaviv/etnaviv_mmu.c
>> +++ b/drivers/gpu/drm/etnaviv/etnaviv_mmu.c
>> @@ -281,6 +281,7 @@ int etnaviv_iommu_map_gem(struct etnaviv_iommu_context *context,
>>  {
>>  	struct sg_table *sgt = etnaviv_obj->sgt;
>>  	struct drm_mm_node *node;
>> +	unsigned int user_size;
>>  	int ret;
>>
>>  	lockdep_assert_held(&etnaviv_obj->lock);
>> @@ -303,13 +304,12 @@ int etnaviv_iommu_map_gem(struct etnaviv_iommu_context *context,
>>  	}
>>
>>  	node = &mapping->vram_node;
>> +	user_size = etnaviv_obj->user_size;
>>
>>  	if (va)
>> -		ret = etnaviv_iommu_insert_exact(context, node,
>> -						 etnaviv_obj->base.size, va);
>> +		ret = etnaviv_iommu_insert_exact(context, node, user_size, va);
>>  	else
>> -		ret = etnaviv_iommu_find_iova(context, node,
>> -					      etnaviv_obj->base.size);
>> +		ret = etnaviv_iommu_find_iova(context, node, user_size);
>>  	if (ret < 0)
>>  		goto unlock;
>>
Hi,

On 10/7/24 18:12, Lucas Stach wrote:
> On Saturday, 05.10.2024 at 03:42 +0800, Sui Jingfeng wrote:
>> Etnaviv assumes that the GPU page size is 4KiB, yet on some systems the CPU
>> page size is 16KiB. The size of etnaviv buffer objects is aligned to the
>> CPU page size on the kernel side; however, userspace still assumes the page
>> size is 4KiB and allocates in units of 4KiB pages. This results in
>> collisions between the GPU virtual address ranges allocated by userspace,
>> so a BO cannot be inserted exactly into the specified hole.
>>
>> The root cause is that a kernel-side BO takes up more address space than
>> userspace assumes when its size is not aligned to the CPU page size. To
>> keep the GPU VA space as contiguous as possible, track the size that
>> userspace/GPU thinks the BO has.
>>
>> Yes, we still need to overallocate to suit the CPU, but there is no need
>> to waste GPU VA space anymore.
>>
>> Signed-off-by: Sui Jingfeng <sui.jingfeng@linux.dev>
>> ---
>>  drivers/gpu/drm/etnaviv/etnaviv_gem.c | 8 +++++---
>>  drivers/gpu/drm/etnaviv/etnaviv_gem.h | 1 +
>>  drivers/gpu/drm/etnaviv/etnaviv_mmu.c | 8 ++++----
>>  3 files changed, 10 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem.c b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
>> index 5c0c9d4e3be1..943fc20093e6 100644
>> --- a/drivers/gpu/drm/etnaviv/etnaviv_gem.c
>> +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
>> @@ -543,7 +543,7 @@ static const struct drm_gem_object_funcs etnaviv_gem_object_funcs = {
>>  	.vm_ops = &vm_ops,
>>  };
>>
>> -static int etnaviv_gem_new_impl(struct drm_device *dev, u32 flags,
>> +static int etnaviv_gem_new_impl(struct drm_device *dev, u32 size, u32 flags,
>>  	const struct etnaviv_gem_ops *ops, struct drm_gem_object **obj)
>>  {
>>  	struct etnaviv_gem_object *etnaviv_obj;
>> @@ -570,6 +570,7 @@ static int etnaviv_gem_new_impl(struct drm_device *dev, u32 flags,
>>  	if (!etnaviv_obj)
>>  		return -ENOMEM;
>>
>> +	etnaviv_obj->user_size = size;
>>  	etnaviv_obj->flags = flags;
>>  	etnaviv_obj->ops = ops;
>>
>> @@ -588,11 +589,12 @@ int etnaviv_gem_new_handle(struct drm_device *dev, struct drm_file *file,
>>  {
>>  	struct etnaviv_drm_private *priv = dev->dev_private;
>>  	struct drm_gem_object *obj = NULL;
>> +	unsigned int user_size = size;
>
> This still needs to be aligned to 4K.

Yes, that is exactly right conceptually. The size has to be aligned to
the GPU page size, because the GPU maps 4KiB at a time. The GPU will
access the full range of a 4KiB page, and this is out of the CPU's control.

> Userspace may request unaligned buffer sizes

Userspace shall *NOT* request unaligned buffers, since userspace has
*already* made the assumption that the GPU page size is 4KiB. It is then
userspace's responsibility to keep the requested buffer size aligned.

- The kernel actually can and *should* return the aligned size to
  userspace, though.
- Since the softpin feature landed, it has become evident that the
  kernel needs userspace to *report* a correct GPU VA length.

But I'm fine with the kernel paying some extra price for safety reasons.

Best regards,
Sui

> and we don't want to risk any confusion about
> which part is visible to the GPU, so better make sure this size is
> aligned to the GPU page size.
> Also, this is more of a personal preference, but I would call this
> gpu_size or something like that, to avoid any confusion with the
> user_size in etnaviv_cmdbuf, where user_size doesn't denote the GPU
> visible size.
>
> Regards,
> Lucas
>
>>  	int ret;
>>
>>  	size = PAGE_ALIGN(size);
>>
>> -	ret = etnaviv_gem_new_impl(dev, flags, &etnaviv_gem_shmem_ops, &obj);
>> +	ret = etnaviv_gem_new_impl(dev, user_size, flags, &etnaviv_gem_shmem_ops, &obj);
>>  	if (ret)
>>  		goto fail;
>>
>> @@ -627,7 +629,7 @@ int etnaviv_gem_new_private(struct drm_device *dev, size_t size, u32 flags,
>>  	struct drm_gem_object *obj;
>>  	int ret;
>>
>> -	ret = etnaviv_gem_new_impl(dev, flags, ops, &obj);
>> +	ret = etnaviv_gem_new_impl(dev, size, flags, ops, &obj);
>>  	if (ret)
>>  		return ret;
>>
>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem.h b/drivers/gpu/drm/etnaviv/etnaviv_gem.h
>> index a42d260cac2c..c6e27b9abb0c 100644
>> --- a/drivers/gpu/drm/etnaviv/etnaviv_gem.h
>> +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem.h
>> @@ -36,6 +36,7 @@ struct etnaviv_gem_object {
>>  	const struct etnaviv_gem_ops *ops;
>>  	struct mutex lock;
>>
>> +	u32 user_size;
>>  	u32 flags;
>>
>>  	struct list_head gem_node;
>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_mmu.c b/drivers/gpu/drm/etnaviv/etnaviv_mmu.c
>> index 1661d589bf3e..6fbc62772d85 100644
>> --- a/drivers/gpu/drm/etnaviv/etnaviv_mmu.c
>> +++ b/drivers/gpu/drm/etnaviv/etnaviv_mmu.c
>> @@ -281,6 +281,7 @@ int etnaviv_iommu_map_gem(struct etnaviv_iommu_context *context,
>>  {
>>  	struct sg_table *sgt = etnaviv_obj->sgt;
>>  	struct drm_mm_node *node;
>> +	unsigned int user_size;
>>  	int ret;
>>
>>  	lockdep_assert_held(&etnaviv_obj->lock);
>> @@ -303,13 +304,12 @@ int etnaviv_iommu_map_gem(struct etnaviv_iommu_context *context,
>>  	}
>>
>>  	node = &mapping->vram_node;
>> +	user_size = etnaviv_obj->user_size;
>>
>>  	if (va)
>> -		ret = etnaviv_iommu_insert_exact(context, node,
>> -						 etnaviv_obj->base.size, va);
>> +		ret = etnaviv_iommu_insert_exact(context, node, user_size, va);
>>  	else
>> -		ret = etnaviv_iommu_find_iova(context, node,
>> -					      etnaviv_obj->base.size);
>> +		ret = etnaviv_iommu_find_iova(context, node, user_size);
>>  	if (ret < 0)
>>  		goto unlock;
>>
>
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem.c b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
index 5c0c9d4e3be1..943fc20093e6 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
@@ -543,7 +543,7 @@ static const struct drm_gem_object_funcs etnaviv_gem_object_funcs = {
 	.vm_ops = &vm_ops,
 };
 
-static int etnaviv_gem_new_impl(struct drm_device *dev, u32 flags,
+static int etnaviv_gem_new_impl(struct drm_device *dev, u32 size, u32 flags,
 	const struct etnaviv_gem_ops *ops, struct drm_gem_object **obj)
 {
 	struct etnaviv_gem_object *etnaviv_obj;
@@ -570,6 +570,7 @@ static int etnaviv_gem_new_impl(struct drm_device *dev, u32 flags,
 	if (!etnaviv_obj)
 		return -ENOMEM;
 
+	etnaviv_obj->user_size = size;
 	etnaviv_obj->flags = flags;
 	etnaviv_obj->ops = ops;
 
@@ -588,11 +589,12 @@ int etnaviv_gem_new_handle(struct drm_device *dev, struct drm_file *file,
 {
 	struct etnaviv_drm_private *priv = dev->dev_private;
 	struct drm_gem_object *obj = NULL;
+	unsigned int user_size = size;
 	int ret;
 
 	size = PAGE_ALIGN(size);
 
-	ret = etnaviv_gem_new_impl(dev, flags, &etnaviv_gem_shmem_ops, &obj);
+	ret = etnaviv_gem_new_impl(dev, user_size, flags, &etnaviv_gem_shmem_ops, &obj);
 	if (ret)
 		goto fail;
 
@@ -627,7 +629,7 @@ int etnaviv_gem_new_private(struct drm_device *dev, size_t size, u32 flags,
 	struct drm_gem_object *obj;
 	int ret;
 
-	ret = etnaviv_gem_new_impl(dev, flags, ops, &obj);
+	ret = etnaviv_gem_new_impl(dev, size, flags, ops, &obj);
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem.h b/drivers/gpu/drm/etnaviv/etnaviv_gem.h
index a42d260cac2c..c6e27b9abb0c 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem.h
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem.h
@@ -36,6 +36,7 @@ struct etnaviv_gem_object {
 	const struct etnaviv_gem_ops *ops;
 	struct mutex lock;
 
+	u32 user_size;
 	u32 flags;
 
 	struct list_head gem_node;
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_mmu.c b/drivers/gpu/drm/etnaviv/etnaviv_mmu.c
index 1661d589bf3e..6fbc62772d85 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_mmu.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_mmu.c
@@ -281,6 +281,7 @@ int etnaviv_iommu_map_gem(struct etnaviv_iommu_context *context,
 {
 	struct sg_table *sgt = etnaviv_obj->sgt;
 	struct drm_mm_node *node;
+	unsigned int user_size;
 	int ret;
 
 	lockdep_assert_held(&etnaviv_obj->lock);
@@ -303,13 +304,12 @@ int etnaviv_iommu_map_gem(struct etnaviv_iommu_context *context,
 	}
 
 	node = &mapping->vram_node;
+	user_size = etnaviv_obj->user_size;
 
 	if (va)
-		ret = etnaviv_iommu_insert_exact(context, node,
-						 etnaviv_obj->base.size, va);
+		ret = etnaviv_iommu_insert_exact(context, node, user_size, va);
 	else
-		ret = etnaviv_iommu_find_iova(context, node,
-					      etnaviv_obj->base.size);
+		ret = etnaviv_iommu_find_iova(context, node, user_size);
 	if (ret < 0)
 		goto unlock;
Etnaviv assumes that the GPU page size is 4KiB, yet on some systems the CPU
page size is 16KiB. The size of etnaviv buffer objects is aligned to the
CPU page size on the kernel side; however, userspace still assumes the page
size is 4KiB and allocates in units of 4KiB pages. This results in
collisions between the GPU virtual address ranges allocated by userspace,
so a BO cannot be inserted exactly into the specified hole.

The root cause is that a kernel-side BO takes up more address space than
userspace assumes when its size is not aligned to the CPU page size. To
keep the GPU VA space as contiguous as possible, track the size that
userspace/GPU thinks the BO has.

Yes, we still need to overallocate to suit the CPU, but there is no need
to waste GPU VA space anymore.

Signed-off-by: Sui Jingfeng <sui.jingfeng@linux.dev>
---
 drivers/gpu/drm/etnaviv/etnaviv_gem.c | 8 +++++---
 drivers/gpu/drm/etnaviv/etnaviv_gem.h | 1 +
 drivers/gpu/drm/etnaviv/etnaviv_mmu.c | 8 ++++----
 3 files changed, 10 insertions(+), 7 deletions(-)
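
The collision described in the commit message can be illustrated with plain arithmetic. The following is a standalone sketch, not driver code: the helpers `align_up`, `ranges_overlap` and `second_bo_collides`, the base address and the example sizes are all hypothetical. It models userspace placing two buffers back to back in 4 KiB units while the kernel reserves GPU VA space using its own page size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GPU_PAGE_SIZE 4096u  /* page size assumed by etnaviv userspace */
#define CPU_PAGE_SIZE 16384u /* example CPU page size from the commit message */

/* Round x up to a multiple of a; a must be a power of two. */
static inline uint32_t align_up(uint32_t x, uint32_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/* True if [a, a + a_len) and [b, b + b_len) share at least one byte. */
static inline bool ranges_overlap(uint32_t a, uint32_t a_len,
				  uint32_t b, uint32_t b_len)
{
	return a < b + b_len && b < a + a_len;
}

/*
 * Userspace places a second one-page BO directly after the first,
 * counting in GPU pages. The kernel reserves GPU VA for the first BO
 * using the size rounded up to kernel_page_size. Returns true if the
 * second BO's requested range collides with that reservation.
 */
static inline bool second_bo_collides(uint32_t req_size,
				      uint32_t kernel_page_size)
{
	uint32_t va0 = 0x10000; /* arbitrary base address for the example */
	uint32_t user_len = align_up(req_size, GPU_PAGE_SIZE);
	uint32_t va1 = va0 + user_len; /* where userspace puts the next BO */
	uint32_t kernel_len = align_up(req_size, kernel_page_size);

	return ranges_overlap(va0, kernel_len, va1, GPU_PAGE_SIZE);
}
```

In this model a 20 KiB request grows to 32 KiB when rounded to a 16 KiB CPU page, so the buffer userspace placed at offset 20 KiB lands inside the first reservation. Reserving only the 4 KiB-aligned size, as the patch does by tracking the size userspace has in mind, leaves the two ranges adjacent and non-overlapping.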