
drm/ttm/nouveau: don't call tt destroy callback on alloc failure.

Message ID: 20200728041736.20689-1-airlied@gmail.com
State: New, archived
Series: drm/ttm/nouveau: don't call tt destroy callback on alloc failure.

Commit Message

Dave Airlie July 28, 2020, 4:17 a.m. UTC
From: Dave Airlie <airlied@redhat.com>

This is confusing, and from my reading of all the drivers only
nouveau got this right.

Just make the API leave the driver in control of its own allocation
failure, and don't call destroy: if the page table fails to be
created, there is nothing to clean up here.

(I'm willing to believe I've missed something here, so please
review deeply).

Signed-off-by: Dave Airlie <airlied@redhat.com>
---
 drivers/gpu/drm/nouveau/nouveau_sgdma.c | 9 +++------
 drivers/gpu/drm/ttm/ttm_tt.c            | 3 ---
 2 files changed, 3 insertions(+), 9 deletions(-)
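For illustration, the contract after this change is that a driver's
create_ttm hook frees its own allocation when the TTM init helper fails,
rather than relying on its destroy callback being invoked. A minimal
sketch of that pattern, using hypothetical driver names (the real
nouveau change is in the hunks below):

/*
 * Sketch only, not part of the patch: a driver ttm_tt_create hook
 * after this change.  "example_ttm" and the function name are
 * hypothetical.
 */
struct example_ttm {
	struct ttm_dma_tt ttm;
	/* driver-private fields ... */
};

static struct ttm_tt *
example_create_ttm(struct ttm_buffer_object *bo, uint32_t page_flags)
{
	struct example_ttm *ett = kzalloc(sizeof(*ett), GFP_KERNEL);

	if (!ett)
		return NULL;

	/*
	 * ttm_dma_tt_init() no longer calls ttm_tt_destroy() on failure,
	 * so the driver frees its own allocation here.
	 */
	if (ttm_dma_tt_init(&ett->ttm, bo, page_flags)) {
		kfree(ett);
		return NULL;
	}
	return &ett->ttm.ttm;
}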

Comments

Christian König July 28, 2020, 7:48 a.m. UTC | #1
Am 28.07.20 um 06:17 schrieb Dave Airlie:
> From: Dave Airlie <airlied@redhat.com>
>
> This is confusing, and from my reading of all the drivers only
> nouveau got this right.
>
> Just make the API leave the driver in control of its own allocation
> failure, and don't call destroy: if the page table fails to be
> created, there is nothing to clean up here.
>
> (I'm willing to believe I've missed something here, so please
> review deeply).
>
> Signed-off-by: Dave Airlie <airlied@redhat.com>

That looks right to me as well.

Reviewed-by: Christian König <christian.koenig@amd.com>

> ---
>   drivers/gpu/drm/nouveau/nouveau_sgdma.c | 9 +++------
>   drivers/gpu/drm/ttm/ttm_tt.c            | 3 ---
>   2 files changed, 3 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nouveau_sgdma.c b/drivers/gpu/drm/nouveau/nouveau_sgdma.c
> index 20b6d0b3de5c..c3ccf661b7a6 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_sgdma.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_sgdma.c
> @@ -95,12 +95,9 @@ nouveau_sgdma_create_ttm(struct ttm_buffer_object *bo, uint32_t page_flags)
>   	else
>   		nvbe->ttm.ttm.func = &nv50_sgdma_backend;
>   
> -	if (ttm_dma_tt_init(&nvbe->ttm, bo, page_flags))
> -		/*
> -		 * A failing ttm_dma_tt_init() will call ttm_tt_destroy()
> -		 * and thus our nouveau_sgdma_destroy() hook, so we don't need
> -		 * to free nvbe here.
> -		 */
> +	if (ttm_dma_tt_init(&nvbe->ttm, bo, page_flags)) {
> +		kfree(nvbe);
>   		return NULL;
> +	}
>   	return &nvbe->ttm.ttm;
>   }
> diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
> index bab67873cfd4..9d1c7177384c 100644
> --- a/drivers/gpu/drm/ttm/ttm_tt.c
> +++ b/drivers/gpu/drm/ttm/ttm_tt.c
> @@ -244,7 +244,6 @@ int ttm_tt_init(struct ttm_tt *ttm, struct ttm_buffer_object *bo,
>   	ttm_tt_init_fields(ttm, bo, page_flags);
>   
>   	if (ttm_tt_alloc_page_directory(ttm)) {
> -		ttm_tt_destroy(ttm);
>   		pr_err("Failed allocating page table\n");
>   		return -ENOMEM;
>   	}
> @@ -268,7 +267,6 @@ int ttm_dma_tt_init(struct ttm_dma_tt *ttm_dma, struct ttm_buffer_object *bo,
>   
>   	INIT_LIST_HEAD(&ttm_dma->pages_list);
>   	if (ttm_dma_tt_alloc_page_directory(ttm_dma)) {
> -		ttm_tt_destroy(ttm);
>   		pr_err("Failed allocating page table\n");
>   		return -ENOMEM;
>   	}
> @@ -290,7 +288,6 @@ int ttm_sg_tt_init(struct ttm_dma_tt *ttm_dma, struct ttm_buffer_object *bo,
>   	else
>   		ret = ttm_dma_tt_alloc_page_directory(ttm_dma);
>   	if (ret) {
> -		ttm_tt_destroy(ttm);
>   		pr_err("Failed allocating page table\n");
>   		return -ENOMEM;
>   	}
Dave Airlie July 29, 2020, 12:08 a.m. UTC | #2
On Tue, 28 Jul 2020 at 17:49, Christian König <christian.koenig@amd.com> wrote:
>
> Am 28.07.20 um 06:17 schrieb Dave Airlie:
> > From: Dave Airlie <airlied@redhat.com>
> >
> > This is confusing, and from my reading of all the drivers only
> > nouveau got this right.
> >
> > Just make the API leave the driver in control of its own allocation
> > failure, and don't call destroy: if the page table fails to be
> > created, there is nothing to clean up here.
> >
> > (I'm willing to believe I've missed something here, so please
> > review deeply).
> >
> > Signed-off-by: Dave Airlie <airlied@redhat.com>
>
> That looks right to me as well.
> Reviewed-by: Christian König <christian.koenig@amd.com>

Landed this in drm-next as well.

Dave.
Christian König July 29, 2020, 12:17 p.m. UTC | #3
Am 29.07.20 um 02:08 schrieb Dave Airlie:
> [SNIP]
> Landed this in drm-next as well.

By the way, there is another design bug in TTM which only affects
Nouveau. See ttm_mem_io_reserve_vm() and ttm_mem_io_free_vm().


What happens is that as soon as there is a userspace VM mapping, the BO
ends up on the LRU and a flag is set noting that, so that we don't free
the VM reference twice.

In addition to that we have a reference count to avoid unbalanced calls
to the io_mem_reserve and io_mem_free callbacks when there are both
userspace VM mappings and kernel mappings (for CPU copies of BOs,
kmap(), etc.).
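A simplified sketch of that flag-plus-refcount pattern, with
hypothetical names rather than the actual TTM fields and callbacks:

/*
 * Simplified illustration of the scheme described above; all names
 * here are hypothetical, not the real TTM code.
 */
struct io_mem_state {
	bool vm_on_lru;		/* userspace VM mapping put the BO on the io LRU */
	long reserve_count;	/* balances reserve()/free() calls */
	int (*reserve)(void);	/* stands in for the driver's io_mem_reserve */
	void (*free)(void);	/* stands in for the driver's io_mem_free */
};

static int io_mem_get(struct io_mem_state *s)
{
	/* only the first user (VM mapping or kernel mapping) reserves */
	if (s->reserve_count++ == 0)
		return s->reserve();
	return 0;
}

static void io_mem_put(struct io_mem_state *s)
{
	/* only the last user releases the reservation */
	if (--s->reserve_count == 0)
		s->free();
}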


But this design is complete nonsense.

What we need instead is handling that prevents adding the io_mem to
the LRU as soon as there are any kernel mappings, because those can't
be restored using a fault.

I already tried to move the whole handling into Nouveau itself, so that
we can properly fix this up there. But Ben reported that my only
compile-tested patches don't work, and without hardware I can't finish
them up.


Regards,
Christian.

Patch

diff --git a/drivers/gpu/drm/nouveau/nouveau_sgdma.c b/drivers/gpu/drm/nouveau/nouveau_sgdma.c
index 20b6d0b3de5c..c3ccf661b7a6 100644
--- a/drivers/gpu/drm/nouveau/nouveau_sgdma.c
+++ b/drivers/gpu/drm/nouveau/nouveau_sgdma.c
@@ -95,12 +95,9 @@  nouveau_sgdma_create_ttm(struct ttm_buffer_object *bo, uint32_t page_flags)
 	else
 		nvbe->ttm.ttm.func = &nv50_sgdma_backend;
 
-	if (ttm_dma_tt_init(&nvbe->ttm, bo, page_flags))
-		/*
-		 * A failing ttm_dma_tt_init() will call ttm_tt_destroy()
-		 * and thus our nouveau_sgdma_destroy() hook, so we don't need
-		 * to free nvbe here.
-		 */
+	if (ttm_dma_tt_init(&nvbe->ttm, bo, page_flags)) {
+		kfree(nvbe);
 		return NULL;
+	}
 	return &nvbe->ttm.ttm;
 }
diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
index bab67873cfd4..9d1c7177384c 100644
--- a/drivers/gpu/drm/ttm/ttm_tt.c
+++ b/drivers/gpu/drm/ttm/ttm_tt.c
@@ -244,7 +244,6 @@  int ttm_tt_init(struct ttm_tt *ttm, struct ttm_buffer_object *bo,
 	ttm_tt_init_fields(ttm, bo, page_flags);
 
 	if (ttm_tt_alloc_page_directory(ttm)) {
-		ttm_tt_destroy(ttm);
 		pr_err("Failed allocating page table\n");
 		return -ENOMEM;
 	}
@@ -268,7 +267,6 @@  int ttm_dma_tt_init(struct ttm_dma_tt *ttm_dma, struct ttm_buffer_object *bo,
 
 	INIT_LIST_HEAD(&ttm_dma->pages_list);
 	if (ttm_dma_tt_alloc_page_directory(ttm_dma)) {
-		ttm_tt_destroy(ttm);
 		pr_err("Failed allocating page table\n");
 		return -ENOMEM;
 	}
@@ -290,7 +288,6 @@  int ttm_sg_tt_init(struct ttm_dma_tt *ttm_dma, struct ttm_buffer_object *bo,
 	else
 		ret = ttm_dma_tt_alloc_page_directory(ttm_dma);
 	if (ret) {
-		ttm_tt_destroy(ttm);
 		pr_err("Failed allocating page table\n");
 		return -ENOMEM;
 	}