[09/11] drm/ttm: convert EDEADLK into EAGAIN

Message ID 20190514123127.1650-9-christian.koenig@amd.com (mailing list archive)
State New, archived
Series [01/11] drm/ttm: Make LRU removal optional.

Commit Message

Christian König May 14, 2019, 12:31 p.m. UTC
Let userspace try again if we really run into a deadlock during eviction.

This has a low chance of livelocking, but guarantees forward progress.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/ttm/ttm_bo.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Daniel Vetter May 15, 2019, 8:40 a.m. UTC | #1
On Tue, May 14, 2019 at 02:31:25PM +0200, Christian König wrote:
> Let userspace try again if we really run into a deadlock during eviction.
> 
> This has a low chance of livelocking, but guarantees forward progress.
> 
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>  drivers/gpu/drm/ttm/ttm_bo.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> index a301c876ae31..ce85cd8b4970 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> @@ -877,6 +877,8 @@ static int ttm_mem_evict_first(struct ttm_bo_device *bdev,
>  					    acquire_ctx);
>  		if (ret) {
>  			ttm_bo_put(first_bo);
> +			if (ret == -EDEADLK)

Is this the ww_mutex EDEADLK or something else? If the former, then letting
that escape unhandled into userspace sounds like a kernel bug ...
-Daniel

> +				ret = -EAGAIN;
>  			return ret;
>  		}
>  		spin_lock(&glob->lru_lock);
> -- 
> 2.17.1
Christian König May 15, 2019, 9:28 a.m. UTC | #2
On 15.05.19 at 10:40, Daniel Vetter wrote:
> On Tue, May 14, 2019 at 02:31:25PM +0200, Christian König wrote:
>> Let userspace try again if we really run into a deadlock during eviction.
>>
>> This has a low chance of livelocking, but guarantees forward progress.
>>
>> Signed-off-by: Christian König <christian.koenig@amd.com>
>> ---
>>   drivers/gpu/drm/ttm/ttm_bo.c | 2 ++
>>   1 file changed, 2 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>> index a301c876ae31..ce85cd8b4970 100644
>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>> @@ -877,6 +877,8 @@ static int ttm_mem_evict_first(struct ttm_bo_device *bdev,
>>   					    acquire_ctx);
>>   		if (ret) {
>>   			ttm_bo_put(first_bo);
>> +			if (ret == -EDEADLK)
> Is this the ww_mutex EDEADLK or something else? If the former, then letting
> that escape unhandled into userspace sounds like a kernel bug ...

Yeah, the problem surfaced because of patch #4. Previously TTM would
have just ignored all errors, continued to try different placements,
and only returned -ENOMEM when we ran out of possible placements.

I probably need to either fix patch #4 or reorder the patches.

Thanks for the note,
Christian.

> -Daniel
>
>> +				ret = -EAGAIN;
>>   			return ret;
>>   		}
>>   		spin_lock(&glob->lru_lock);
>> -- 
>> 2.17.1
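
The earlier behaviour described in the reply above (ignore every error, keep trying other placements, return -ENOMEM only when none are left) can be sketched in plain userspace C. This is only an illustration under stated assumptions: try_placement() and both loop functions are hypothetical stand-ins, not TTM functions, and the second loop is merely a guess at what "errors now surface" could look like, not a description of patch #4.

```c
#include <errno.h>

/* Hypothetical stand-in for a single placement attempt (not a TTM API).
 * Placement 0 pretends to hit a lock conflict, placement 1 pretends to
 * be busy, and every other placement succeeds. */
static int try_placement(int placement)
{
	if (placement == 0)
		return -EDEADLK;
	if (placement == 1)
		return -EBUSY;
	return 0;
}

/* Old behaviour as described in the thread: every error is ignored and
 * the next placement is tried; -ENOMEM is returned only once all
 * placements have been exhausted. */
static int place_ignoring_errors(int num_placements)
{
	for (int i = 0; i < num_placements; i++) {
		if (try_placement(i) == 0)
			return 0;
	}
	return -ENOMEM;
}

/* Assumed contrast case: the first error is handed straight back to the
 * caller, so an -EDEADLK from the lock step can now escape. */
static int place_propagating_errors(int num_placements)
{
	for (int i = 0; i < num_placements; i++) {
		int ret = try_placement(i);

		if (ret)
			return ret;
	}
	return -ENOMEM;
}
```

With three placements the first loop succeeds on the third one, while the second loop returns -EDEADLK from the very first attempt, which is the situation the patch under review then maps to -EAGAIN.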

Patch

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index a301c876ae31..ce85cd8b4970 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -877,6 +877,8 @@  static int ttm_mem_evict_first(struct ttm_bo_device *bdev,
 					    acquire_ctx);
 		if (ret) {
 			ttm_bo_put(first_bo);
+			if (ret == -EDEADLK)
+				ret = -EAGAIN;
 			return ret;
 		}
 		spin_lock(&glob->lru_lock);
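
To make the intent of the two added lines concrete, here is a minimal userspace sketch of the error translation together with a caller that retries on -EAGAIN. Everything in it is an assumption made for illustration: struct fake_dev, fake_reserve(), fake_evict_first() and submit_with_retry() are hypothetical names, and the bounded retry count is a choice of this sketch, not something the patch specifies.

```c
#include <errno.h>

/* Hypothetical device state: how many lock conflicts are still pending. */
struct fake_dev {
	int conflicts;
};

/* Hypothetical stand-in for the reservation step; it fails with
 * -EDEADLK until the pending conflicts have been used up. */
static int fake_reserve(struct fake_dev *dev)
{
	if (dev->conflicts > 0) {
		dev->conflicts--;
		return -EDEADLK;
	}
	return 0;
}

/* Mirrors the shape of the patched error path: -EDEADLK is converted
 * to -EAGAIN before being returned; other errors pass through. */
static int fake_evict_first(struct fake_dev *dev)
{
	int ret = fake_reserve(dev);

	if (ret) {
		if (ret == -EDEADLK)
			ret = -EAGAIN;
		return ret;
	}
	return 0;
}

/* Caller-side retry: try again while the call reports -EAGAIN, but give
 * up after max_tries attempts so a livelock cannot spin forever. */
static int submit_with_retry(struct fake_dev *dev, int max_tries,
			     int *tries_used)
{
	int ret = -EAGAIN;
	int tries = 0;

	while (tries < max_tries) {
		tries++;
		ret = fake_evict_first(dev);
		if (ret != -EAGAIN)
			break;
	}
	*tries_used = tries;
	return ret;
}
```

The sketch does not address the review question of whether a ww_mutex -EDEADLK should reach this point at all; it only shows what a retrying caller would observe if it does.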