
drm/amdgpu: fix sched fence slab teardown

Message ID 1477247507-11378-2-git-send-email-notasas@gmail.com (mailing list archive)
State New, archived

Commit Message

Grazvydas Ignotas Oct. 23, 2016, 6:31 p.m. UTC
Fences are freed via call_rcu(), which invokes amd_sched_fence_free()
after a grace period. During teardown there is no guarantee that all
callbacks have finished, so sched_fence_slab may be destroyed before
all fences have been freed. If we are lucky, this only produces some
slab warnings; if not, we get a crash in one of the RCU threads, because
the callback runs after amdgpu has already been unloaded.

Fix it with an rcu_barrier().

Fixes: 189e0fb76304 ("drm/amdgpu: RCU protected amd_sched_fence_release")
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
---
 drivers/gpu/drm/amd/scheduler/gpu_scheduler.c | 1 +
 1 file changed, 1 insertion(+)
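
The ordering problem described above can be sketched in plain userspace C.
This is only a toy model under stated assumptions, not kernel code: every
model_* name is hypothetical, the "grace period" is reduced to a simple
queue, and the cache is just a counter of live objects. It shows that a
destroy issued while deferred frees are still queued sees live objects,
while a barrier issued first drains them.

```c
/* Userspace model only: none of the model_* names are kernel APIs. */
#include <assert.h>
#include <stddef.h>

#define MAX_PENDING 16

/* Stand-in for the slab cache: counts objects that are still allocated. */
struct model_cache {
	int live_objects;
};

/* Stand-in for the queue of callbacks registered with call_rcu(). */
struct model_rcu {
	struct model_cache *cache_of[MAX_PENDING];
	size_t pending;
};

/* Like call_rcu(): the free is only queued here, not performed. */
static void model_call_rcu(struct model_rcu *rcu, struct model_cache *cache)
{
	assert(rcu->pending < MAX_PENDING);
	rcu->cache_of[rcu->pending++] = cache;
}

/* Like rcu_barrier(): returns only after every queued callback has run. */
static void model_rcu_barrier(struct model_rcu *rcu)
{
	while (rcu->pending > 0) {
		struct model_cache *cache = rcu->cache_of[--rcu->pending];

		cache->live_objects--;	/* the deferred free finally happens */
	}
}

/* Like kmem_cache_destroy(): reports objects still allocated at destroy. */
static int model_cache_destroy(struct model_cache *cache)
{
	return cache->live_objects;
}

/* Returns the number of fences still allocated when the cache is destroyed. */
int model_teardown(int use_barrier)
{
	struct model_cache cache = { .live_objects = 3 };
	struct model_rcu rcu = { .pending = 0 };
	int i;

	/* Three fences are released; each free is deferred. */
	for (i = 0; i < 3; i++)
		model_call_rcu(&rcu, &cache);

	if (use_barrier)
		model_rcu_barrier(&rcu);

	return model_cache_destroy(&cache);
}
```

In this model, model_teardown(0) corresponds to the old amd_sched_fini()
ordering and model_teardown(1) to the ordering with the barrier in place.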

Comments

Chunming Zhou Oct. 24, 2016, 2:34 a.m. UTC | #1
Acked-by: Chunming Zhou <david1.zhou@amd.com>

Christian König Oct. 24, 2016, 9:06 a.m. UTC | #2
Reviewed-by: Christian König <christian.koenig@amd.com>

Alex Deucher Oct. 24, 2016, 4:08 p.m. UTC | #3

Applied. Thanks!

Alex
Patch

diff --git a/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c b/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
index 963a24d..910b8d5 100644
--- a/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
+++ b/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
@@ -645,6 +645,7 @@ void amd_sched_fini(struct amd_gpu_scheduler *sched)
 {
 	if (sched->thread)
 		kthread_stop(sched->thread);
+	rcu_barrier();
 	if (atomic_dec_and_test(&sched_fence_slab_ref))
 		kmem_cache_destroy(sched_fence_slab);
 }