Message ID | 1477247507-11378-2-git-send-email-notasas@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Acked-by: Chunming Zhou <david1.zhou@amd.com>

On 2016-10-24 02:31, Grazvydas Ignotas wrote:
> To free fences, call_rcu() is used, which calls amd_sched_fence_free()
> after a grace period. During teardown, there is no guarantee all
> callbacks have finished, so sched_fence_slab may be destroyed before
> all fences have been freed. If we are lucky, this results in some slab
> warnings, if not, we get a crash in one of rcu threads because callback
> is called after amdgpu has already been unloaded.
>
> Fix it with a rcu_barrier().
>
> Fixes: 189e0fb76304 ("drm/amdgpu: RCU protected amd_sched_fence_release")
> Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
> ---
>  drivers/gpu/drm/amd/scheduler/gpu_scheduler.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c b/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
> index 963a24d..910b8d5 100644
> --- a/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
> +++ b/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
> @@ -645,6 +645,7 @@ void amd_sched_fini(struct amd_gpu_scheduler *sched)
>  {
>  	if (sched->thread)
>  		kthread_stop(sched->thread);
> +	rcu_barrier();
>  	if (atomic_dec_and_test(&sched_fence_slab_ref))
>  		kmem_cache_destroy(sched_fence_slab);
>  }
Reviewed-by: Christian König <christian.koenig@amd.com>

On 24.10.2016 at 04:34, zhoucm1 wrote:
> Acked-by: Chunming Zhou <david1.zhou@amd.com>
>
> On 2016-10-24 02:31, Grazvydas Ignotas wrote:
>> To free fences, call_rcu() is used, which calls amd_sched_fence_free()
>> after a grace period. During teardown, there is no guarantee all
>> callbacks have finished, so sched_fence_slab may be destroyed before
>> all fences have been freed. If we are lucky, this results in some slab
>> warnings, if not, we get a crash in one of rcu threads because callback
>> is called after amdgpu has already been unloaded.
>>
>> Fix it with a rcu_barrier().
>>
>> Fixes: 189e0fb76304 ("drm/amdgpu: RCU protected
>> amd_sched_fence_release")
>> Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
>> ---
>>  drivers/gpu/drm/amd/scheduler/gpu_scheduler.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
>> b/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
>> index 963a24d..910b8d5 100644
>> --- a/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
>> +++ b/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
>> @@ -645,6 +645,7 @@ void amd_sched_fini(struct amd_gpu_scheduler *sched)
>>  {
>>  	if (sched->thread)
>>  		kthread_stop(sched->thread);
>> +	rcu_barrier();
>>  	if (atomic_dec_and_test(&sched_fence_slab_ref))
>>  		kmem_cache_destroy(sched_fence_slab);
>>  }
On Mon, Oct 24, 2016 at 5:06 AM, Christian König <deathsimple@vodafone.de> wrote:
> Reviewed-by: Christian König <christian.koenig@amd.com>
>
> On 24.10.2016 at 04:34, zhoucm1 wrote:
>>
>> Acked-by: Chunming Zhou <david1.zhou@amd.com>
>>
>> On 2016-10-24 02:31, Grazvydas Ignotas wrote:
>>>
>>> To free fences, call_rcu() is used, which calls amd_sched_fence_free()
>>> after a grace period. During teardown, there is no guarantee all
>>> callbacks have finished, so sched_fence_slab may be destroyed before
>>> all fences have been freed. If we are lucky, this results in some slab
>>> warnings, if not, we get a crash in one of rcu threads because callback
>>> is called after amdgpu has already been unloaded.
>>>
>>> Fix it with a rcu_barrier().
>>>
>>> Fixes: 189e0fb76304 ("drm/amdgpu: RCU protected amd_sched_fence_release")
>>> Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
>>> ---
>>>  drivers/gpu/drm/amd/scheduler/gpu_scheduler.c | 1 +
>>>  1 file changed, 1 insertion(+)
>>>
>>> diff --git a/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
>>> b/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
>>> index 963a24d..910b8d5 100644
>>> --- a/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
>>> +++ b/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
>>> @@ -645,6 +645,7 @@ void amd_sched_fini(struct amd_gpu_scheduler *sched)
>>>  {
>>>  	if (sched->thread)
>>>  		kthread_stop(sched->thread);
>>> +	rcu_barrier();
>>>  	if (atomic_dec_and_test(&sched_fence_slab_ref))
>>>  		kmem_cache_destroy(sched_fence_slab);
>>>  }
>>

Applied. Thanks!

Alex
diff --git a/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c b/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
index 963a24d..910b8d5 100644
--- a/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
+++ b/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
@@ -645,6 +645,7 @@ void amd_sched_fini(struct amd_gpu_scheduler *sched)
 {
 	if (sched->thread)
 		kthread_stop(sched->thread);
+	rcu_barrier();
 	if (atomic_dec_and_test(&sched_fence_slab_ref))
 		kmem_cache_destroy(sched_fence_slab);
 }
To free fences, call_rcu() is used, which calls amd_sched_fence_free()
after a grace period. During teardown, there is no guarantee all
callbacks have finished, so sched_fence_slab may be destroyed before
all fences have been freed. If we are lucky, this results in some slab
warnings, if not, we get a crash in one of rcu threads because callback
is called after amdgpu has already been unloaded.

Fix it with a rcu_barrier().

Fixes: 189e0fb76304 ("drm/amdgpu: RCU protected amd_sched_fence_release")
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
---
 drivers/gpu/drm/amd/scheduler/gpu_scheduler.c | 1 +
 1 file changed, 1 insertion(+)
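
The ordering problem described in the commit message can be illustrated with a small user-space model. This is only a sketch and not kernel code: every `toy_*` name and `simulate_teardown()` are hypothetical stand-ins invented for this illustration, the "cache" merely counts live objects, and the "barrier" simply runs the queued callbacks. It assumes nothing about the real RCU implementation beyond the two properties the patch relies on: call_rcu() defers the callback, and rcu_barrier() returns only after all previously queued callbacks have completed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a slab cache: it only counts live objects. */
struct toy_cache {
	int live; /* objects allocated and not yet freed */
};

/* Hypothetical stand-in for one queued RCU callback. */
struct toy_cb {
	void (*func)(struct toy_cache *cache);
	struct toy_cache *cache;
};

#define TOY_MAX_PENDING 16
static struct toy_cb pending[TOY_MAX_PENDING];
static size_t npending;

/* Models call_rcu(): the callback is queued, not run immediately. */
static void toy_call_rcu(void (*func)(struct toy_cache *),
			 struct toy_cache *cache)
{
	assert(npending < TOY_MAX_PENDING);
	pending[npending].func = func;
	pending[npending].cache = cache;
	npending++;
}

/* Models rcu_barrier(): returns only after every queued callback has run. */
static void toy_rcu_barrier(void)
{
	for (size_t i = 0; i < npending; i++)
		pending[i].func(pending[i].cache);
	npending = 0;
}

/* Models amd_sched_fence_free(): hands one object back to the cache. */
static void toy_fence_free(struct toy_cache *cache)
{
	cache->live--;
}

/* Models kmem_cache_destroy(): reports how many objects are still live. */
static int toy_cache_destroy(const struct toy_cache *cache)
{
	return cache->live;
}

/*
 * Queue three deferred frees, optionally wait for them, then destroy the
 * cache. Returns the number of objects still live at destroy time.
 */
int simulate_teardown(bool use_barrier)
{
	struct toy_cache cache = { .live = 3 };
	int outstanding;

	npending = 0;
	for (int i = 0; i < 3; i++)
		toy_call_rcu(toy_fence_free, &cache);

	if (use_barrier)
		toy_rcu_barrier();

	outstanding = toy_cache_destroy(&cache);

	/*
	 * Drop any leftover callbacks. In the kernel they would still run
	 * later, against a destroyed cache or unloaded module code.
	 */
	npending = 0;
	return outstanding;
}
```

In this model, `simulate_teardown(false)` destroys the cache while all three objects are still outstanding, which corresponds to the slab warnings or crash described above, whereas `simulate_teardown(true)` destroys it only after every deferred free has run.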