diff mbox series

drm/ttm: fix incorrect TT->SYSTEM move handling

Message ID 20200916142406.2379-1-christian.koenig@amd.com
State New, archived
Headers show
Series drm/ttm: fix incorrect TT->SYSTEM move handling | expand

Commit Message

Christian König Sept. 16, 2020, 2:24 p.m. UTC
When we move from the SYSTEM domain to the TT domain
we still need to potentially change the caching state.

This is most likely the source of a bunch of problems with
AGP and USWC together with hibernation and swap.

Signed-off-by: Christian König <christian.koenig@amd.com>
CC: stable@vger.kernel.org
---
 drivers/gpu/drm/ttm/ttm_bo.c | 8 --------
 1 file changed, 8 deletions(-)

Comments

Christian König Sept. 16, 2020, 2:49 p.m. UTC | #1
Am 16.09.20 um 16:24 schrieb Christian König:
> When we move from the SYSTEM domain to the TT domain

Typo in the subject, the order in the sentence here is the correct one.

This is an important bug fix I'm hunting for years. Please review.

Thanks,
Christian.

> we still need to potentially change the caching state.
>
> This is most likely the source of a bunch of problems with
> AGP and USWC together with hibernation and swap.
>
> Signed-off-by: Christian König <christian.koenig@amd.com>
> CC: stable@vger.kernel.org
> ---
>   drivers/gpu/drm/ttm/ttm_bo.c | 8 --------
>   1 file changed, 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> index ffbdc20d8e8d..5f7efc90970e 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> @@ -264,13 +264,6 @@ static int ttm_bo_handle_move_mem(struct ttm_buffer_object *bo,
>   			if (ret)
>   				goto out_err;
>   		}
> -
> -		if (bo->mem.mem_type == TTM_PL_SYSTEM) {
> -			if (bdev->driver->move_notify)
> -				bdev->driver->move_notify(bo, evict, mem);
> -			bo->mem = *mem;
> -			goto moved;
> -		}
>   	}
>   
>   	if (bdev->driver->move_notify)
> @@ -293,7 +286,6 @@ static int ttm_bo_handle_move_mem(struct ttm_buffer_object *bo,
>   		goto out_err;
>   	}
>   
> -moved:
>   	bo->evicted = false;
>   
>   	ctx->bytes_moved += bo->num_pages << PAGE_SHIFT;
Dave Airlie Sept. 16, 2020, 7:27 p.m. UTC | #2
On Thu, 17 Sep 2020 at 00:24, Christian König
<ckoenig.leichtzumerken@gmail.com> wrote:
>
> When we move from the SYSTEM domain to the TT domain
> we still need to potentially change the caching state.
>
> This is most likely the source of a bunch of problems with
> AGP and USWC together with hibernation and swap.

I'm going to need more commentary, because I've been staring at this
code a lot in the past few days and I'm although I dislike this path,
I'm not sure what this brings.

The current code flow to me is for SYSTEM->TT domain

ttm_tt_create
ttm_tt_set_placement_caching (new placement)
ttm_tt_bind (can cause populate)
move_notify
replace pointers
evicted = false
return

The new flow looks like
ttm_tt_create
ttm_tt_set_placement_caching (new placement)
ttm_tt_bind (can cause populate)
move_notify
(via ttm_bo_move_ttm)
ttm_tt_set_placement_caching (new placement)
ttm_tt_bind
replace pointers
(back to main code)
evicted = false
return

Is the second set placement caching doing something different here? or
is there something that happens in move notify that we need to set
things afterwards?

Dave.

>
> Signed-off-by: Christian König <christian.koenig@amd.com>
> CC: stable@vger.kernel.org
> ---
>  drivers/gpu/drm/ttm/ttm_bo.c | 8 --------
>  1 file changed, 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> index ffbdc20d8e8d..5f7efc90970e 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> @@ -264,13 +264,6 @@ static int ttm_bo_handle_move_mem(struct ttm_buffer_object *bo,
>                         if (ret)
>                                 goto out_err;
>                 }
> -
> -               if (bo->mem.mem_type == TTM_PL_SYSTEM) {
> -                       if (bdev->driver->move_notify)
> -                               bdev->driver->move_notify(bo, evict, mem);
> -                       bo->mem = *mem;
> -                       goto moved;
> -               }
>         }
>
>         if (bdev->driver->move_notify)
> @@ -293,7 +286,6 @@ static int ttm_bo_handle_move_mem(struct ttm_buffer_object *bo,
>                 goto out_err;
>         }
>
> -moved:
>         bo->evicted = false;
>
>         ctx->bytes_moved += bo->num_pages << PAGE_SHIFT;
> --
> 2.17.1
>
Christian König Sept. 17, 2020, 8:12 a.m. UTC | #3
Am 16.09.20 um 21:27 schrieb Dave Airlie:
> On Thu, 17 Sep 2020 at 00:24, Christian König
> <ckoenig.leichtzumerken@gmail.com> wrote:
>> When we move from the SYSTEM domain to the TT domain
>> we still need to potentially change the caching state.
>>
>> This is most likely the source of a bunch of problems with
>> AGP and USWC together with hibernation and swap.
> I'm going to need more commentary, because I've been staring at this
> code a lot in the past few days and I'm although I dislike this path,
> I'm not sure what this brings.
>
> The current code flow to me is for SYSTEM->TT domain
>
> ttm_tt_create
> ttm_tt_set_placement_caching (new placement)
> ttm_tt_bind (can cause populate)
> move_notify
> replace pointers
> evicted = false
> return
>
> The new flow looks like
> ttm_tt_create
> ttm_tt_set_placement_caching (new placement)
> ttm_tt_bind (can cause populate)
> move_notify
> (via ttm_bo_move_ttm)
> ttm_tt_set_placement_caching (new placement)
> ttm_tt_bind
> replace pointers
> (back to main code)
> evicted = false
> return
>
> Is the second set placement caching doing something different here? or
> is there something that happens in move notify that we need to set
> things afterwards?

Oh, I was blind. Haven't seen the call to ttm_tt_set_placement_caching() 
directly before this one.

So forget this one, need to keep searching why hibernation doesn't work :(

Christian.

>
> Dave.
>
>> Signed-off-by: Christian König <christian.koenig@amd.com>
>> CC: stable@vger.kernel.org
>> ---
>>   drivers/gpu/drm/ttm/ttm_bo.c | 8 --------
>>   1 file changed, 8 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>> index ffbdc20d8e8d..5f7efc90970e 100644
>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>> @@ -264,13 +264,6 @@ static int ttm_bo_handle_move_mem(struct ttm_buffer_object *bo,
>>                          if (ret)
>>                                  goto out_err;
>>                  }
>> -
>> -               if (bo->mem.mem_type == TTM_PL_SYSTEM) {
>> -                       if (bdev->driver->move_notify)
>> -                               bdev->driver->move_notify(bo, evict, mem);
>> -                       bo->mem = *mem;
>> -                       goto moved;
>> -               }
>>          }
>>
>>          if (bdev->driver->move_notify)
>> @@ -293,7 +286,6 @@ static int ttm_bo_handle_move_mem(struct ttm_buffer_object *bo,
>>                  goto out_err;
>>          }
>>
>> -moved:
>>          bo->evicted = false;
>>
>>          ctx->bytes_moved += bo->num_pages << PAGE_SHIFT;
>> --
>> 2.17.1
>>
diff mbox series

Patch

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index ffbdc20d8e8d..5f7efc90970e 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -264,13 +264,6 @@  static int ttm_bo_handle_move_mem(struct ttm_buffer_object *bo,
 			if (ret)
 				goto out_err;
 		}
-
-		if (bo->mem.mem_type == TTM_PL_SYSTEM) {
-			if (bdev->driver->move_notify)
-				bdev->driver->move_notify(bo, evict, mem);
-			bo->mem = *mem;
-			goto moved;
-		}
 	}
 
 	if (bdev->driver->move_notify)
@@ -293,7 +286,6 @@  static int ttm_bo_handle_move_mem(struct ttm_buffer_object *bo,
 		goto out_err;
 	}
 
-moved:
 	bo->evicted = false;
 
 	ctx->bytes_moved += bo->num_pages << PAGE_SHIFT;