Message ID | 1613599181-9492-1-git-send-email-andrey.grodzovsky@amd.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v2] drm/scheduler: Fix hang when sched_entity released | expand |
Am 17.02.21 um 22:59 schrieb Andrey Grodzovsky: > Problem: If scheduler is already stopped by the time sched_entity > is released and entity's job_queue not empty I encountred > a hang in drm_sched_entity_flush. This is because drm_sched_entity_is_idle > never becomes false. > > Fix: In drm_sched_fini detach all sched_entities from the > scheduler's run queues. This will satisfy drm_sched_entity_is_idle. > Also wakeup all those processes stuck in sched_entity flushing > as the scheduler main thread which wakes them up is stopped by now. > > v2: > Reverse order of drm_sched_rq_remove_entity and marking > s_entity as stopped to prevent reinserion back to rq due > to race. > > Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> > --- > drivers/gpu/drm/scheduler/sched_main.c | 31 +++++++++++++++++++++++++++++++ > 1 file changed, 31 insertions(+) > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c > index 908b0b5..c6b7947 100644 > --- a/drivers/gpu/drm/scheduler/sched_main.c > +++ b/drivers/gpu/drm/scheduler/sched_main.c > @@ -897,9 +897,40 @@ EXPORT_SYMBOL(drm_sched_init); > */ > void drm_sched_fini(struct drm_gpu_scheduler *sched) > { > + int i; > + struct drm_sched_entity *s_entity; BTW: Please order that so that i is declared last. > if (sched->thread) > kthread_stop(sched->thread); > > + /* Detach all sched_entites from this scheduler once it's stopped */ > + for (i = DRM_SCHED_PRIORITY_COUNT - 1; i >= DRM_SCHED_PRIORITY_MIN; i--) { > + struct drm_sched_rq *rq = &sched->sched_rq[i]; > + > + if (!rq) > + continue; > + > + /* Loop this way because rq->lock is taken in drm_sched_rq_remove_entity */ > + spin_lock(&rq->lock); > + while ((s_entity = list_first_entry_or_null(&rq->entities, > + struct drm_sched_entity, > + list))) { > + spin_unlock(&rq->lock); > + > + /* Prevent reinsertion and remove */ > + spin_lock(&s_entity->rq_lock); > + s_entity->stopped = true; > + drm_sched_rq_remove_entity(rq, s_entity); > + spin_unlock(&s_entity->rq_lock); Well this spin_unlock/lock dance here doesn't look correct at all now. Christian. > + > + spin_lock(&rq->lock); > + } > + spin_unlock(&rq->lock); > + > + } > + > + /* Wakeup everyone stuck in drm_sched_entity_flush for this scheduler */ > + wake_up_all(&sched->job_scheduled); > + > /* Confirm no work left behind accessing device structures */ > cancel_delayed_work_sync(&sched->work_tdr); >
On 2/18/21 3:07 AM, Christian König wrote: > > > Am 17.02.21 um 22:59 schrieb Andrey Grodzovsky: >> Problem: If scheduler is already stopped by the time sched_entity >> is released and entity's job_queue not empty I encountred >> a hang in drm_sched_entity_flush. This is because drm_sched_entity_is_idle >> never becomes false. >> >> Fix: In drm_sched_fini detach all sched_entities from the >> scheduler's run queues. This will satisfy drm_sched_entity_is_idle. >> Also wakeup all those processes stuck in sched_entity flushing >> as the scheduler main thread which wakes them up is stopped by now. >> >> v2: >> Reverse order of drm_sched_rq_remove_entity and marking >> s_entity as stopped to prevent reinserion back to rq due >> to race. >> >> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> >> --- >> drivers/gpu/drm/scheduler/sched_main.c | 31 +++++++++++++++++++++++++++++++ >> 1 file changed, 31 insertions(+) >> >> diff --git a/drivers/gpu/drm/scheduler/sched_main.c >> b/drivers/gpu/drm/scheduler/sched_main.c >> index 908b0b5..c6b7947 100644 >> --- a/drivers/gpu/drm/scheduler/sched_main.c >> +++ b/drivers/gpu/drm/scheduler/sched_main.c >> @@ -897,9 +897,40 @@ EXPORT_SYMBOL(drm_sched_init); >> */ >> void drm_sched_fini(struct drm_gpu_scheduler *sched) >> { >> + int i; >> + struct drm_sched_entity *s_entity; > > BTW: Please order that so that i is declared last. > >> if (sched->thread) >> kthread_stop(sched->thread); >> + /* Detach all sched_entites from this scheduler once it's stopped */ >> + for (i = DRM_SCHED_PRIORITY_COUNT - 1; i >= DRM_SCHED_PRIORITY_MIN; i--) { >> + struct drm_sched_rq *rq = &sched->sched_rq[i]; >> + >> + if (!rq) >> + continue; >> + >> + /* Loop this way because rq->lock is taken in >> drm_sched_rq_remove_entity */ >> + spin_lock(&rq->lock); >> + while ((s_entity = list_first_entry_or_null(&rq->entities, >> + struct drm_sched_entity, >> + list))) { >> + spin_unlock(&rq->lock); >> + >> + /* Prevent reinsertion and remove */ >> + spin_lock(&s_entity->rq_lock); >> + s_entity->stopped = true; >> + drm_sched_rq_remove_entity(rq, s_entity); >> + spin_unlock(&s_entity->rq_lock); > > Well this spin_unlock/lock dance here doesn't look correct at all now. > > Christian. In what way ? It's in the same same order as in other call sites (see drm_sched_entity_push_job and drm_sched_entity_flush). If i just locked rq->lock and did list_for_each_entry_safe while manually deleting entity->list instead of calling drm_sched_rq_remove_entity this still would not be possible as the order of lock acquisition between s_entity->rq_lock and rq->lock would be reverse compared to the call sites mentioned above. Andrey > >> + >> + spin_lock(&rq->lock); >> + } >> + spin_unlock(&rq->lock); >> + >> + } >> + >> + /* Wakeup everyone stuck in drm_sched_entity_flush for this scheduler */ >> + wake_up_all(&sched->job_scheduled); >> + >> /* Confirm no work left behind accessing device structures */ >> cancel_delayed_work_sync(&sched->work_tdr); >
Am 18.02.21 um 16:05 schrieb Andrey Grodzovsky: > > On 2/18/21 3:07 AM, Christian König wrote: >> >> >> Am 17.02.21 um 22:59 schrieb Andrey Grodzovsky: >>> Problem: If scheduler is already stopped by the time sched_entity >>> is released and entity's job_queue not empty I encountred >>> a hang in drm_sched_entity_flush. This is because >>> drm_sched_entity_is_idle >>> never becomes false. >>> >>> Fix: In drm_sched_fini detach all sched_entities from the >>> scheduler's run queues. This will satisfy drm_sched_entity_is_idle. >>> Also wakeup all those processes stuck in sched_entity flushing >>> as the scheduler main thread which wakes them up is stopped by now. >>> >>> v2: >>> Reverse order of drm_sched_rq_remove_entity and marking >>> s_entity as stopped to prevent reinserion back to rq due >>> to race. >>> >>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> >>> --- >>> drivers/gpu/drm/scheduler/sched_main.c | 31 >>> +++++++++++++++++++++++++++++++ >>> 1 file changed, 31 insertions(+) >>> >>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c >>> b/drivers/gpu/drm/scheduler/sched_main.c >>> index 908b0b5..c6b7947 100644 >>> --- a/drivers/gpu/drm/scheduler/sched_main.c >>> +++ b/drivers/gpu/drm/scheduler/sched_main.c >>> @@ -897,9 +897,40 @@ EXPORT_SYMBOL(drm_sched_init); >>> */ >>> void drm_sched_fini(struct drm_gpu_scheduler *sched) >>> { >>> + int i; >>> + struct drm_sched_entity *s_entity; >> >> BTW: Please order that so that i is declared last. >> >>> if (sched->thread) >>> kthread_stop(sched->thread); >>> + /* Detach all sched_entites from this scheduler once it's >>> stopped */ >>> + for (i = DRM_SCHED_PRIORITY_COUNT - 1; i >= >>> DRM_SCHED_PRIORITY_MIN; i--) { >>> + struct drm_sched_rq *rq = &sched->sched_rq[i]; >>> + >>> + if (!rq) >>> + continue; >>> + >>> + /* Loop this way because rq->lock is taken in >>> drm_sched_rq_remove_entity */ >>> + spin_lock(&rq->lock); >>> + while ((s_entity = list_first_entry_or_null(&rq->entities, >>> + struct drm_sched_entity, >>> + list))) { >>> + spin_unlock(&rq->lock); >>> + >>> + /* Prevent reinsertion and remove */ >>> + spin_lock(&s_entity->rq_lock); >>> + s_entity->stopped = true; >>> + drm_sched_rq_remove_entity(rq, s_entity); >>> + spin_unlock(&s_entity->rq_lock); >> >> Well this spin_unlock/lock dance here doesn't look correct at all now. >> >> Christian. > > > In what way ? It's in the same same order as in other call sites (see > drm_sched_entity_push_job and drm_sched_entity_flush). > If i just locked rq->lock and did list_for_each_entry_safe while > manually deleting entity->list instead of calling > drm_sched_rq_remove_entity this still would not be possible as the > order of lock acquisition between s_entity->rq_lock > and rq->lock would be reverse compared to the call sites mentioned above. Ah, now I understand. You need this because drm_sched_rq_remove_entity() will grab the rq lock again! Problem is now what prevents the entity from being destroyed while you remove it? Christian. > > Andrey > > > >> >>> + >>> + spin_lock(&rq->lock); >>> + } >>> + spin_unlock(&rq->lock); >>> + >>> + } >>> + >>> + /* Wakeup everyone stuck in drm_sched_entity_flush for this >>> scheduler */ >>> + wake_up_all(&sched->job_scheduled); >>> + >>> /* Confirm no work left behind accessing device structures */ >>> cancel_delayed_work_sync(&sched->work_tdr); >>
On 2/18/21 10:15 AM, Christian König wrote: > Am 18.02.21 um 16:05 schrieb Andrey Grodzovsky: >> >> On 2/18/21 3:07 AM, Christian König wrote: >>> >>> >>> Am 17.02.21 um 22:59 schrieb Andrey Grodzovsky: >>>> Problem: If scheduler is already stopped by the time sched_entity >>>> is released and entity's job_queue not empty I encountred >>>> a hang in drm_sched_entity_flush. This is because drm_sched_entity_is_idle >>>> never becomes false. >>>> >>>> Fix: In drm_sched_fini detach all sched_entities from the >>>> scheduler's run queues. This will satisfy drm_sched_entity_is_idle. >>>> Also wakeup all those processes stuck in sched_entity flushing >>>> as the scheduler main thread which wakes them up is stopped by now. >>>> >>>> v2: >>>> Reverse order of drm_sched_rq_remove_entity and marking >>>> s_entity as stopped to prevent reinserion back to rq due >>>> to race. >>>> >>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> >>>> --- >>>> drivers/gpu/drm/scheduler/sched_main.c | 31 +++++++++++++++++++++++++++++++ >>>> 1 file changed, 31 insertions(+) >>>> >>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c >>>> b/drivers/gpu/drm/scheduler/sched_main.c >>>> index 908b0b5..c6b7947 100644 >>>> --- a/drivers/gpu/drm/scheduler/sched_main.c >>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c >>>> @@ -897,9 +897,40 @@ EXPORT_SYMBOL(drm_sched_init); >>>> */ >>>> void drm_sched_fini(struct drm_gpu_scheduler *sched) >>>> { >>>> + int i; >>>> + struct drm_sched_entity *s_entity; >>> >>> BTW: Please order that so that i is declared last. >>> >>>> if (sched->thread) >>>> kthread_stop(sched->thread); >>>> + /* Detach all sched_entites from this scheduler once it's stopped */ >>>> + for (i = DRM_SCHED_PRIORITY_COUNT - 1; i >= DRM_SCHED_PRIORITY_MIN; >>>> i--) { >>>> + struct drm_sched_rq *rq = &sched->sched_rq[i]; >>>> + >>>> + if (!rq) >>>> + continue; >>>> + >>>> + /* Loop this way because rq->lock is taken in >>>> drm_sched_rq_remove_entity */ >>>> + spin_lock(&rq->lock); >>>> + while ((s_entity = list_first_entry_or_null(&rq->entities, >>>> + struct drm_sched_entity, >>>> + list))) { >>>> + spin_unlock(&rq->lock); >>>> + >>>> + /* Prevent reinsertion and remove */ >>>> + spin_lock(&s_entity->rq_lock); >>>> + s_entity->stopped = true; >>>> + drm_sched_rq_remove_entity(rq, s_entity); >>>> + spin_unlock(&s_entity->rq_lock); >>> >>> Well this spin_unlock/lock dance here doesn't look correct at all now. >>> >>> Christian. >> >> >> In what way ? It's in the same same order as in other call sites (see >> drm_sched_entity_push_job and drm_sched_entity_flush). >> If i just locked rq->lock and did list_for_each_entry_safe while manually >> deleting entity->list instead of calling >> drm_sched_rq_remove_entity this still would not be possible as the order of >> lock acquisition between s_entity->rq_lock >> and rq->lock would be reverse compared to the call sites mentioned above. > > Ah, now I understand. You need this because drm_sched_rq_remove_entity() will > grab the rq lock again! > > Problem is now what prevents the entity from being destroyed while you remove it? > > Christian. Right, well, since (unfortunately) sched_entity is part of amdgpu_ctx_entity and amdgpu_ctx_entity is refcounted there is a problem here that we don't increment amdgpu_ctx.refcount when assigning sched_entity to new rq (e.g. before drm_sched_rq_add_entity) and not decrement before removing. We do it for amdgpu_cs_parser.entity for example (in amdgpu_cs_parser_init and amdgpu_cs_parser_fini by calling amdgpu_ctx_get and amdgpu_ctx_put). But this seems a bit tricky due to all the drm_sched_entity_select_rq logic. Another, kind of a band aid fix, would probably be just locking amdgpu_ctx_mgr.lock around drm_sched_fini when finalizing the fence driver and around idr iteration in amdgpu_ctx_mgr_fini (which should be lock protected anyway as I see from other idr usages in the code) ... This should prevent this use after free. Andrey > >> >> Andrey >> >> >> >>> >>>> + >>>> + spin_lock(&rq->lock); >>>> + } >>>> + spin_unlock(&rq->lock); >>>> + >>>> + } >>>> + >>>> + /* Wakeup everyone stuck in drm_sched_entity_flush for this scheduler */ >>>> + wake_up_all(&sched->job_scheduled); >>>> + >>>> /* Confirm no work left behind accessing device structures */ >>>> cancel_delayed_work_sync(&sched->work_tdr); >>> >
Ping Andrey On 2/18/21 11:41 AM, Andrey Grodzovsky wrote: > > On 2/18/21 10:15 AM, Christian König wrote: >> Am 18.02.21 um 16:05 schrieb Andrey Grodzovsky: >>> >>> On 2/18/21 3:07 AM, Christian König wrote: >>>> >>>> >>>> Am 17.02.21 um 22:59 schrieb Andrey Grodzovsky: >>>>> Problem: If scheduler is already stopped by the time sched_entity >>>>> is released and entity's job_queue not empty I encountred >>>>> a hang in drm_sched_entity_flush. This is because drm_sched_entity_is_idle >>>>> never becomes false. >>>>> >>>>> Fix: In drm_sched_fini detach all sched_entities from the >>>>> scheduler's run queues. This will satisfy drm_sched_entity_is_idle. >>>>> Also wakeup all those processes stuck in sched_entity flushing >>>>> as the scheduler main thread which wakes them up is stopped by now. >>>>> >>>>> v2: >>>>> Reverse order of drm_sched_rq_remove_entity and marking >>>>> s_entity as stopped to prevent reinserion back to rq due >>>>> to race. >>>>> >>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> >>>>> --- >>>>> drivers/gpu/drm/scheduler/sched_main.c | 31 +++++++++++++++++++++++++++++++ >>>>> 1 file changed, 31 insertions(+) >>>>> >>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c >>>>> b/drivers/gpu/drm/scheduler/sched_main.c >>>>> index 908b0b5..c6b7947 100644 >>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c >>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c >>>>> @@ -897,9 +897,40 @@ EXPORT_SYMBOL(drm_sched_init); >>>>> */ >>>>> void drm_sched_fini(struct drm_gpu_scheduler *sched) >>>>> { >>>>> + int i; >>>>> + struct drm_sched_entity *s_entity; >>>> >>>> BTW: Please order that so that i is declared last. >>>> >>>>> if (sched->thread) >>>>> kthread_stop(sched->thread); >>>>> + /* Detach all sched_entites from this scheduler once it's stopped */ >>>>> + for (i = DRM_SCHED_PRIORITY_COUNT - 1; i >= DRM_SCHED_PRIORITY_MIN; >>>>> i--) { >>>>> + struct drm_sched_rq *rq = &sched->sched_rq[i]; >>>>> + >>>>> + if (!rq) >>>>> + continue; >>>>> + >>>>> + /* Loop this way because rq->lock is taken in >>>>> drm_sched_rq_remove_entity */ >>>>> + spin_lock(&rq->lock); >>>>> + while ((s_entity = list_first_entry_or_null(&rq->entities, >>>>> + struct drm_sched_entity, >>>>> + list))) { >>>>> + spin_unlock(&rq->lock); >>>>> + >>>>> + /* Prevent reinsertion and remove */ >>>>> + spin_lock(&s_entity->rq_lock); >>>>> + s_entity->stopped = true; >>>>> + drm_sched_rq_remove_entity(rq, s_entity); >>>>> + spin_unlock(&s_entity->rq_lock); >>>> >>>> Well this spin_unlock/lock dance here doesn't look correct at all now. >>>> >>>> Christian. >>> >>> >>> In what way ? It's in the same same order as in other call sites (see >>> drm_sched_entity_push_job and drm_sched_entity_flush). >>> If i just locked rq->lock and did list_for_each_entry_safe while manually >>> deleting entity->list instead of calling >>> drm_sched_rq_remove_entity this still would not be possible as the order of >>> lock acquisition between s_entity->rq_lock >>> and rq->lock would be reverse compared to the call sites mentioned above. >> >> Ah, now I understand. You need this because drm_sched_rq_remove_entity() will >> grab the rq lock again! >> >> Problem is now what prevents the entity from being destroyed while you remove >> it? >> >> Christian. > > Right, well, since (unfortunately) sched_entity is part of amdgpu_ctx_entity > and amdgpu_ctx_entity is refcounted > there is a problem here that we don't increment amdgpu_ctx.refcount when > assigning sched_entity > to new rq (e.g. before drm_sched_rq_add_entity) and not decrement before > removing. We do it for > amdgpu_cs_parser.entity for example (in amdgpu_cs_parser_init and > amdgpu_cs_parser_fini by > calling amdgpu_ctx_get and amdgpu_ctx_put). But this seems a bit tricky due to > all the drm_sched_entity_select_rq > logic. > > Another, kind of a band aid fix, would probably be just locking > amdgpu_ctx_mgr.lock around drm_sched_fini > when finalizing the fence driver and around idr iteration in > amdgpu_ctx_mgr_fini (which should be lock protected > anyway as I see from other idr usages in the code) ... This should prevent > this use after free. > > Andrey > > >> >>> >>> Andrey >>> >>> >>> >>>> >>>>> + >>>>> + spin_lock(&rq->lock); >>>>> + } >>>>> + spin_unlock(&rq->lock); >>>>> + >>>>> + } >>>>> + >>>>> + /* Wakeup everyone stuck in drm_sched_entity_flush for this scheduler */ >>>>> + wake_up_all(&sched->job_scheduled); >>>>> + >>>>> /* Confirm no work left behind accessing device structures */ >>>>> cancel_delayed_work_sync(&sched->work_tdr); >>>> >>
Am 18.02.21 um 17:41 schrieb Andrey Grodzovsky: > > On 2/18/21 10:15 AM, Christian König wrote: >> Am 18.02.21 um 16:05 schrieb Andrey Grodzovsky: >>> >>> On 2/18/21 3:07 AM, Christian König wrote: >>>> >>>> >>>> Am 17.02.21 um 22:59 schrieb Andrey Grodzovsky: >>>>> Problem: If scheduler is already stopped by the time sched_entity >>>>> is released and entity's job_queue not empty I encountred >>>>> a hang in drm_sched_entity_flush. This is because >>>>> drm_sched_entity_is_idle >>>>> never becomes false. >>>>> >>>>> Fix: In drm_sched_fini detach all sched_entities from the >>>>> scheduler's run queues. This will satisfy drm_sched_entity_is_idle. >>>>> Also wakeup all those processes stuck in sched_entity flushing >>>>> as the scheduler main thread which wakes them up is stopped by now. >>>>> >>>>> v2: >>>>> Reverse order of drm_sched_rq_remove_entity and marking >>>>> s_entity as stopped to prevent reinserion back to rq due >>>>> to race. >>>>> >>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> >>>>> --- >>>>> drivers/gpu/drm/scheduler/sched_main.c | 31 >>>>> +++++++++++++++++++++++++++++++ >>>>> 1 file changed, 31 insertions(+) >>>>> >>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c >>>>> b/drivers/gpu/drm/scheduler/sched_main.c >>>>> index 908b0b5..c6b7947 100644 >>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c >>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c >>>>> @@ -897,9 +897,40 @@ EXPORT_SYMBOL(drm_sched_init); >>>>> */ >>>>> void drm_sched_fini(struct drm_gpu_scheduler *sched) >>>>> { >>>>> + int i; >>>>> + struct drm_sched_entity *s_entity; >>>> >>>> BTW: Please order that so that i is declared last. >>>> >>>>> if (sched->thread) >>>>> kthread_stop(sched->thread); >>>>> + /* Detach all sched_entites from this scheduler once it's >>>>> stopped */ >>>>> + for (i = DRM_SCHED_PRIORITY_COUNT - 1; i >= >>>>> DRM_SCHED_PRIORITY_MIN; i--) { >>>>> + struct drm_sched_rq *rq = &sched->sched_rq[i]; >>>>> + >>>>> + if (!rq) >>>>> + continue; >>>>> + >>>>> + /* Loop this way because rq->lock is taken in >>>>> drm_sched_rq_remove_entity */ >>>>> + spin_lock(&rq->lock); >>>>> + while ((s_entity = list_first_entry_or_null(&rq->entities, >>>>> + struct drm_sched_entity, >>>>> + list))) { >>>>> + spin_unlock(&rq->lock); >>>>> + >>>>> + /* Prevent reinsertion and remove */ >>>>> + spin_lock(&s_entity->rq_lock); >>>>> + s_entity->stopped = true; >>>>> + drm_sched_rq_remove_entity(rq, s_entity); >>>>> + spin_unlock(&s_entity->rq_lock); >>>> >>>> Well this spin_unlock/lock dance here doesn't look correct at all now. >>>> >>>> Christian. >>> >>> >>> In what way ? It's in the same same order as in other call sites >>> (see drm_sched_entity_push_job and drm_sched_entity_flush). >>> If i just locked rq->lock and did list_for_each_entry_safe while >>> manually deleting entity->list instead of calling >>> drm_sched_rq_remove_entity this still would not be possible as the >>> order of lock acquisition between s_entity->rq_lock >>> and rq->lock would be reverse compared to the call sites mentioned >>> above. >> >> Ah, now I understand. You need this because >> drm_sched_rq_remove_entity() will grab the rq lock again! >> >> Problem is now what prevents the entity from being destroyed while >> you remove it? >> >> Christian. > > Right, well, since (unfortunately) sched_entity is part of > amdgpu_ctx_entity and amdgpu_ctx_entity is refcounted > there is a problem here that we don't increment amdgpu_ctx.refcount > when assigning sched_entity > to new rq (e.g. before drm_sched_rq_add_entity) and not decrement > before removing. We do it for > amdgpu_cs_parser.entity for example (in amdgpu_cs_parser_init and > amdgpu_cs_parser_fini by > calling amdgpu_ctx_get and amdgpu_ctx_put). But this seems a bit > tricky due to all the drm_sched_entity_select_rq > logic. > > Another, kind of a band aid fix, would probably be just locking > amdgpu_ctx_mgr.lock around drm_sched_fini > when finalizing the fence driver and around idr iteration in > amdgpu_ctx_mgr_fini (which should be lock protected > anyway as I see from other idr usages in the code) ... This should > prevent this use after free. Puh, that's rather complicated as well. Ok let's look at it from the other side for a moment. Why do we have to remove the entities from the rq in the first place? Wouldn't it be sufficient to just set all of them to stopped? Christian. > > Andrey > > >> >>> >>> Andrey >>> >>> >>> >>>> >>>>> + >>>>> + spin_lock(&rq->lock); >>>>> + } >>>>> + spin_unlock(&rq->lock); >>>>> + >>>>> + } >>>>> + >>>>> + /* Wakeup everyone stuck in drm_sched_entity_flush for this >>>>> scheduler */ >>>>> + wake_up_all(&sched->job_scheduled); >>>>> + >>>>> /* Confirm no work left behind accessing device structures */ >>>>> cancel_delayed_work_sync(&sched->work_tdr); >>>> >>
On 2/20/21 3:38 AM, Christian König wrote: > > > Am 18.02.21 um 17:41 schrieb Andrey Grodzovsky: >> >> On 2/18/21 10:15 AM, Christian König wrote: >>> Am 18.02.21 um 16:05 schrieb Andrey Grodzovsky: >>>> >>>> On 2/18/21 3:07 AM, Christian König wrote: >>>>> >>>>> >>>>> Am 17.02.21 um 22:59 schrieb Andrey Grodzovsky: >>>>>> Problem: If scheduler is already stopped by the time sched_entity >>>>>> is released and entity's job_queue not empty I encountred >>>>>> a hang in drm_sched_entity_flush. This is because drm_sched_entity_is_idle >>>>>> never becomes false. >>>>>> >>>>>> Fix: In drm_sched_fini detach all sched_entities from the >>>>>> scheduler's run queues. This will satisfy drm_sched_entity_is_idle. >>>>>> Also wakeup all those processes stuck in sched_entity flushing >>>>>> as the scheduler main thread which wakes them up is stopped by now. >>>>>> >>>>>> v2: >>>>>> Reverse order of drm_sched_rq_remove_entity and marking >>>>>> s_entity as stopped to prevent reinserion back to rq due >>>>>> to race. >>>>>> >>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> >>>>>> --- >>>>>> drivers/gpu/drm/scheduler/sched_main.c | 31 >>>>>> +++++++++++++++++++++++++++++++ >>>>>> 1 file changed, 31 insertions(+) >>>>>> >>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c >>>>>> b/drivers/gpu/drm/scheduler/sched_main.c >>>>>> index 908b0b5..c6b7947 100644 >>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c >>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c >>>>>> @@ -897,9 +897,40 @@ EXPORT_SYMBOL(drm_sched_init); >>>>>> */ >>>>>> void drm_sched_fini(struct drm_gpu_scheduler *sched) >>>>>> { >>>>>> + int i; >>>>>> + struct drm_sched_entity *s_entity; >>>>> >>>>> BTW: Please order that so that i is declared last. >>>>> >>>>>> if (sched->thread) >>>>>> kthread_stop(sched->thread); >>>>>> + /* Detach all sched_entites from this scheduler once it's stopped */ >>>>>> + for (i = DRM_SCHED_PRIORITY_COUNT - 1; i >= DRM_SCHED_PRIORITY_MIN; >>>>>> i--) { >>>>>> + struct drm_sched_rq *rq = &sched->sched_rq[i]; >>>>>> + >>>>>> + if (!rq) >>>>>> + continue; >>>>>> + >>>>>> + /* Loop this way because rq->lock is taken in >>>>>> drm_sched_rq_remove_entity */ >>>>>> + spin_lock(&rq->lock); >>>>>> + while ((s_entity = list_first_entry_or_null(&rq->entities, >>>>>> + struct drm_sched_entity, >>>>>> + list))) { >>>>>> + spin_unlock(&rq->lock); >>>>>> + >>>>>> + /* Prevent reinsertion and remove */ >>>>>> + spin_lock(&s_entity->rq_lock); >>>>>> + s_entity->stopped = true; >>>>>> + drm_sched_rq_remove_entity(rq, s_entity); >>>>>> + spin_unlock(&s_entity->rq_lock); >>>>> >>>>> Well this spin_unlock/lock dance here doesn't look correct at all now. >>>>> >>>>> Christian. >>>> >>>> >>>> In what way ? It's in the same same order as in other call sites (see >>>> drm_sched_entity_push_job and drm_sched_entity_flush). >>>> If i just locked rq->lock and did list_for_each_entry_safe while manually >>>> deleting entity->list instead of calling >>>> drm_sched_rq_remove_entity this still would not be possible as the order of >>>> lock acquisition between s_entity->rq_lock >>>> and rq->lock would be reverse compared to the call sites mentioned above. >>> >>> Ah, now I understand. You need this because drm_sched_rq_remove_entity() >>> will grab the rq lock again! >>> >>> Problem is now what prevents the entity from being destroyed while you >>> remove it? >>> >>> Christian. >> >> Right, well, since (unfortunately) sched_entity is part of amdgpu_ctx_entity >> and amdgpu_ctx_entity is refcounted >> there is a problem here that we don't increment amdgpu_ctx.refcount when >> assigning sched_entity >> to new rq (e.g. before drm_sched_rq_add_entity) and not decrement before >> removing. We do it for >> amdgpu_cs_parser.entity for example (in amdgpu_cs_parser_init and >> amdgpu_cs_parser_fini by >> calling amdgpu_ctx_get and amdgpu_ctx_put). But this seems a bit tricky due >> to all the drm_sched_entity_select_rq >> logic. >> >> Another, kind of a band aid fix, would probably be just locking >> amdgpu_ctx_mgr.lock around drm_sched_fini >> when finalizing the fence driver and around idr iteration in >> amdgpu_ctx_mgr_fini (which should be lock protected >> anyway as I see from other idr usages in the code) ... This should prevent >> this use after free. > > Puh, that's rather complicated as well. Ok let's look at it from the other > side for a moment. > > Why do we have to remove the entities from the rq in the first place? > > Wouldn't it be sufficient to just set all of them to stopped? > > Christian. And adding it as another condition in drm_sched_entity_is_idle ? Andrey > >> >> Andrey >> >> >>> >>>> >>>> Andrey >>>> >>>> >>>> >>>>> >>>>>> + >>>>>> + spin_lock(&rq->lock); >>>>>> + } >>>>>> + spin_unlock(&rq->lock); >>>>>> + >>>>>> + } >>>>>> + >>>>>> + /* Wakeup everyone stuck in drm_sched_entity_flush for this >>>>>> scheduler */ >>>>>> + wake_up_all(&sched->job_scheduled); >>>>>> + >>>>>> /* Confirm no work left behind accessing device structures */ >>>>>> cancel_delayed_work_sync(&sched->work_tdr); >>>>> >>> >
Ping Andrey On 2021-02-20 7:12 a.m., Andrey Grodzovsky wrote: > > On 2/20/21 3:38 AM, Christian König wrote: >> >> >> Am 18.02.21 um 17:41 schrieb Andrey Grodzovsky: >>> >>> On 2/18/21 10:15 AM, Christian König wrote: >>>> Am 18.02.21 um 16:05 schrieb Andrey Grodzovsky: >>>>> >>>>> On 2/18/21 3:07 AM, Christian König wrote: >>>>>> >>>>>> >>>>>> Am 17.02.21 um 22:59 schrieb Andrey Grodzovsky: >>>>>>> Problem: If scheduler is already stopped by the time sched_entity >>>>>>> is released and entity's job_queue not empty I encountred >>>>>>> a hang in drm_sched_entity_flush. This is because >>>>>>> drm_sched_entity_is_idle >>>>>>> never becomes false. >>>>>>> >>>>>>> Fix: In drm_sched_fini detach all sched_entities from the >>>>>>> scheduler's run queues. This will satisfy drm_sched_entity_is_idle. >>>>>>> Also wakeup all those processes stuck in sched_entity flushing >>>>>>> as the scheduler main thread which wakes them up is stopped by now. >>>>>>> >>>>>>> v2: >>>>>>> Reverse order of drm_sched_rq_remove_entity and marking >>>>>>> s_entity as stopped to prevent reinserion back to rq due >>>>>>> to race. >>>>>>> >>>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> >>>>>>> --- >>>>>>> drivers/gpu/drm/scheduler/sched_main.c | 31 >>>>>>> +++++++++++++++++++++++++++++++ >>>>>>> 1 file changed, 31 insertions(+) >>>>>>> >>>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c >>>>>>> b/drivers/gpu/drm/scheduler/sched_main.c >>>>>>> index 908b0b5..c6b7947 100644 >>>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c >>>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c >>>>>>> @@ -897,9 +897,40 @@ EXPORT_SYMBOL(drm_sched_init); >>>>>>> */ >>>>>>> void drm_sched_fini(struct drm_gpu_scheduler *sched) >>>>>>> { >>>>>>> + int i; >>>>>>> + struct drm_sched_entity *s_entity; >>>>>> >>>>>> BTW: Please order that so that i is declared last. >>>>>> >>>>>>> if (sched->thread) >>>>>>> kthread_stop(sched->thread); >>>>>>> + /* Detach all sched_entites from this scheduler once it's >>>>>>> stopped */ >>>>>>> + for (i = DRM_SCHED_PRIORITY_COUNT - 1; i >= >>>>>>> DRM_SCHED_PRIORITY_MIN; i--) { >>>>>>> + struct drm_sched_rq *rq = &sched->sched_rq[i]; >>>>>>> + >>>>>>> + if (!rq) >>>>>>> + continue; >>>>>>> + >>>>>>> + /* Loop this way because rq->lock is taken in >>>>>>> drm_sched_rq_remove_entity */ >>>>>>> + spin_lock(&rq->lock); >>>>>>> + while ((s_entity = list_first_entry_or_null(&rq->entities, >>>>>>> + struct drm_sched_entity, >>>>>>> + list))) { >>>>>>> + spin_unlock(&rq->lock); >>>>>>> + >>>>>>> + /* Prevent reinsertion and remove */ >>>>>>> + spin_lock(&s_entity->rq_lock); >>>>>>> + s_entity->stopped = true; >>>>>>> + drm_sched_rq_remove_entity(rq, s_entity); >>>>>>> + spin_unlock(&s_entity->rq_lock); >>>>>> >>>>>> Well this spin_unlock/lock dance here doesn't look correct at all >>>>>> now. >>>>>> >>>>>> Christian. >>>>> >>>>> >>>>> In what way ? It's in the same same order as in other call sites >>>>> (see drm_sched_entity_push_job and drm_sched_entity_flush). >>>>> If i just locked rq->lock and did list_for_each_entry_safe while >>>>> manually deleting entity->list instead of calling >>>>> drm_sched_rq_remove_entity this still would not be possible as the >>>>> order of lock acquisition between s_entity->rq_lock >>>>> and rq->lock would be reverse compared to the call sites mentioned >>>>> above. >>>> >>>> Ah, now I understand. You need this because >>>> drm_sched_rq_remove_entity() will grab the rq lock again! >>>> >>>> Problem is now what prevents the entity from being destroyed while >>>> you remove it? >>>> >>>> Christian. >>> >>> Right, well, since (unfortunately) sched_entity is part of >>> amdgpu_ctx_entity and amdgpu_ctx_entity is refcounted >>> there is a problem here that we don't increment amdgpu_ctx.refcount >>> when assigning sched_entity >>> to new rq (e.g. before drm_sched_rq_add_entity) and not decrement >>> before removing. We do it for >>> amdgpu_cs_parser.entity for example (in amdgpu_cs_parser_init and >>> amdgpu_cs_parser_fini by >>> calling amdgpu_ctx_get and amdgpu_ctx_put). But this seems a bit >>> tricky due to all the drm_sched_entity_select_rq >>> logic. >>> >>> Another, kind of a band aid fix, would probably be just locking >>> amdgpu_ctx_mgr.lock around drm_sched_fini >>> when finalizing the fence driver and around idr iteration in >>> amdgpu_ctx_mgr_fini (which should be lock protected >>> anyway as I see from other idr usages in the code) ... This should >>> prevent this use after free. >> >> Puh, that's rather complicated as well. Ok let's look at it from the >> other side for a moment. >> >> Why do we have to remove the entities from the rq in the first place? >> >> Wouldn't it be sufficient to just set all of them to stopped? >> >> Christian. > > > And adding it as another condition in drm_sched_entity_is_idle ? > > Andrey > > >> >>> >>> Andrey >>> >>> >>>> >>>>> >>>>> Andrey >>>>> >>>>> >>>>> >>>>>> >>>>>>> + >>>>>>> + spin_lock(&rq->lock); >>>>>>> + } >>>>>>> + spin_unlock(&rq->lock); >>>>>>> + >>>>>>> + } >>>>>>> + >>>>>>> + /* Wakeup everyone stuck in drm_sched_entity_flush for this >>>>>>> scheduler */ >>>>>>> + wake_up_all(&sched->job_scheduled); >>>>>>> + >>>>>>> /* Confirm no work left behind accessing device structures */ >>>>>>> cancel_delayed_work_sync(&sched->work_tdr); >>>>>> >>>> >> > _______________________________________________ > amd-gfx mailing list > amd-gfx@lists.freedesktop.org > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&data=04%7C01%7Candrey.grodzovsky%40amd.com%7Cd948a126eeb84abf3e1308d8d598ca2f%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637494199501121764%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=SAWjfsFZ%2BTi6jKVcyJtIC9qM8EgGthmt3MnSFlxbut8%3D&reserved=0 >
Ping Andrey On 2021-02-20 7:12 a.m., Andrey Grodzovsky wrote: > > On 2/20/21 3:38 AM, Christian König wrote: >> >> >> Am 18.02.21 um 17:41 schrieb Andrey Grodzovsky: >>> >>> On 2/18/21 10:15 AM, Christian König wrote: >>>> Am 18.02.21 um 16:05 schrieb Andrey Grodzovsky: >>>>> >>>>> On 2/18/21 3:07 AM, Christian König wrote: >>>>>> >>>>>> >>>>>> Am 17.02.21 um 22:59 schrieb Andrey Grodzovsky: >>>>>>> Problem: If scheduler is already stopped by the time sched_entity >>>>>>> is released and entity's job_queue not empty I encountred >>>>>>> a hang in drm_sched_entity_flush. This is because >>>>>>> drm_sched_entity_is_idle >>>>>>> never becomes false. >>>>>>> >>>>>>> Fix: In drm_sched_fini detach all sched_entities from the >>>>>>> scheduler's run queues. This will satisfy drm_sched_entity_is_idle. >>>>>>> Also wakeup all those processes stuck in sched_entity flushing >>>>>>> as the scheduler main thread which wakes them up is stopped by now. >>>>>>> >>>>>>> v2: >>>>>>> Reverse order of drm_sched_rq_remove_entity and marking >>>>>>> s_entity as stopped to prevent reinserion back to rq due >>>>>>> to race. >>>>>>> >>>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> >>>>>>> --- >>>>>>> drivers/gpu/drm/scheduler/sched_main.c | 31 >>>>>>> +++++++++++++++++++++++++++++++ >>>>>>> 1 file changed, 31 insertions(+) >>>>>>> >>>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c >>>>>>> b/drivers/gpu/drm/scheduler/sched_main.c >>>>>>> index 908b0b5..c6b7947 100644 >>>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c >>>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c >>>>>>> @@ -897,9 +897,40 @@ EXPORT_SYMBOL(drm_sched_init); >>>>>>> */ >>>>>>> void drm_sched_fini(struct drm_gpu_scheduler *sched) >>>>>>> { >>>>>>> + int i; >>>>>>> + struct drm_sched_entity *s_entity; >>>>>> >>>>>> BTW: Please order that so that i is declared last. >>>>>> >>>>>>> if (sched->thread) >>>>>>> kthread_stop(sched->thread); >>>>>>> + /* Detach all sched_entites from this scheduler once it's >>>>>>> stopped */ >>>>>>> + for (i = DRM_SCHED_PRIORITY_COUNT - 1; i >= >>>>>>> DRM_SCHED_PRIORITY_MIN; i--) { >>>>>>> + struct drm_sched_rq *rq = &sched->sched_rq[i]; >>>>>>> + >>>>>>> + if (!rq) >>>>>>> + continue; >>>>>>> + >>>>>>> + /* Loop this way because rq->lock is taken in >>>>>>> drm_sched_rq_remove_entity */ >>>>>>> + spin_lock(&rq->lock); >>>>>>> + while ((s_entity = list_first_entry_or_null(&rq->entities, >>>>>>> + struct drm_sched_entity, >>>>>>> + list))) { >>>>>>> + spin_unlock(&rq->lock); >>>>>>> + >>>>>>> + /* Prevent reinsertion and remove */ >>>>>>> + spin_lock(&s_entity->rq_lock); >>>>>>> + s_entity->stopped = true; >>>>>>> + drm_sched_rq_remove_entity(rq, s_entity); >>>>>>> + spin_unlock(&s_entity->rq_lock); >>>>>> >>>>>> Well this spin_unlock/lock dance here doesn't look correct at all >>>>>> now. >>>>>> >>>>>> Christian. >>>>> >>>>> >>>>> In what way ? It's in the same same order as in other call sites >>>>> (see drm_sched_entity_push_job and drm_sched_entity_flush). >>>>> If i just locked rq->lock and did list_for_each_entry_safe while >>>>> manually deleting entity->list instead of calling >>>>> drm_sched_rq_remove_entity this still would not be possible as the >>>>> order of lock acquisition between s_entity->rq_lock >>>>> and rq->lock would be reverse compared to the call sites mentioned >>>>> above. >>>> >>>> Ah, now I understand. You need this because >>>> drm_sched_rq_remove_entity() will grab the rq lock again! >>>> >>>> Problem is now what prevents the entity from being destroyed while >>>> you remove it? >>>> >>>> Christian. >>> >>> Right, well, since (unfortunately) sched_entity is part of >>> amdgpu_ctx_entity and amdgpu_ctx_entity is refcounted >>> there is a problem here that we don't increment amdgpu_ctx.refcount >>> when assigning sched_entity >>> to new rq (e.g. before drm_sched_rq_add_entity) and not decrement >>> before removing. We do it for >>> amdgpu_cs_parser.entity for example (in amdgpu_cs_parser_init and >>> amdgpu_cs_parser_fini by >>> calling amdgpu_ctx_get and amdgpu_ctx_put). But this seems a bit >>> tricky due to all the drm_sched_entity_select_rq >>> logic. >>> >>> Another, kind of a band aid fix, would probably be just locking >>> amdgpu_ctx_mgr.lock around drm_sched_fini >>> when finalizing the fence driver and around idr iteration in >>> amdgpu_ctx_mgr_fini (which should be lock protected >>> anyway as I see from other idr usages in the code) ... This should >>> prevent this use after free. >> >> Puh, that's rather complicated as well. Ok let's look at it from the >> other side for a moment. >> >> Why do we have to remove the entities from the rq in the first place? >> >> Wouldn't it be sufficient to just set all of them to stopped? >> >> Christian. > > > And adding it as another condition in drm_sched_entity_is_idle ? > > Andrey > > >> >>> >>> Andrey >>> >>> >>>> >>>>> >>>>> Andrey >>>>> >>>>> >>>>> >>>>>> >>>>>>> + >>>>>>> + spin_lock(&rq->lock); >>>>>>> + } >>>>>>> + spin_unlock(&rq->lock); >>>>>>> + >>>>>>> + } >>>>>>> + >>>>>>> + /* Wakeup everyone stuck in drm_sched_entity_flush for this >>>>>>> scheduler */ >>>>>>> + wake_up_all(&sched->job_scheduled); >>>>>>> + >>>>>>> /* Confirm no work left behind accessing device structures */ >>>>>>> cancel_delayed_work_sync(&sched->work_tdr); >>>>>> >>>> >>
Am 24.02.21 um 16:13 schrieb Andrey Grodzovsky: > Ping Sorry, I've been on vacation this week. > > Andrey > > On 2021-02-20 7:12 a.m., Andrey Grodzovsky wrote: >> >> On 2/20/21 3:38 AM, Christian König wrote: >>> >>> >>> Am 18.02.21 um 17:41 schrieb Andrey Grodzovsky: >>>> >>>> On 2/18/21 10:15 AM, Christian König wrote: >>>>> Am 18.02.21 um 16:05 schrieb Andrey Grodzovsky: >>>>>> >>>>>> On 2/18/21 3:07 AM, Christian König wrote: >>>>>>> >>>>>>> >>>>>>> Am 17.02.21 um 22:59 schrieb Andrey Grodzovsky: >>>>>>>> Problem: If scheduler is already stopped by the time sched_entity >>>>>>>> is released and entity's job_queue not empty I encountred >>>>>>>> a hang in drm_sched_entity_flush. This is because >>>>>>>> drm_sched_entity_is_idle >>>>>>>> never becomes false. >>>>>>>> >>>>>>>> Fix: In drm_sched_fini detach all sched_entities from the >>>>>>>> scheduler's run queues. This will satisfy >>>>>>>> drm_sched_entity_is_idle. >>>>>>>> Also wakeup all those processes stuck in sched_entity flushing >>>>>>>> as the scheduler main thread which wakes them up is stopped by >>>>>>>> now. >>>>>>>> >>>>>>>> v2: >>>>>>>> Reverse order of drm_sched_rq_remove_entity and marking >>>>>>>> s_entity as stopped to prevent reinserion back to rq due >>>>>>>> to race. >>>>>>>> >>>>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> >>>>>>>> --- >>>>>>>> drivers/gpu/drm/scheduler/sched_main.c | 31 >>>>>>>> +++++++++++++++++++++++++++++++ >>>>>>>> 1 file changed, 31 insertions(+) >>>>>>>> >>>>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c >>>>>>>> b/drivers/gpu/drm/scheduler/sched_main.c >>>>>>>> index 908b0b5..c6b7947 100644 >>>>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c >>>>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c >>>>>>>> @@ -897,9 +897,40 @@ EXPORT_SYMBOL(drm_sched_init); >>>>>>>> */ >>>>>>>> void drm_sched_fini(struct drm_gpu_scheduler *sched) >>>>>>>> { >>>>>>>> + int i; >>>>>>>> + struct drm_sched_entity *s_entity; >>>>>>> >>>>>>> BTW: Please order that so that i is declared last. >>>>>>> >>>>>>>> if (sched->thread) >>>>>>>> kthread_stop(sched->thread); >>>>>>>> + /* Detach all sched_entites from this scheduler once >>>>>>>> it's stopped */ >>>>>>>> + for (i = DRM_SCHED_PRIORITY_COUNT - 1; i >= >>>>>>>> DRM_SCHED_PRIORITY_MIN; i--) { >>>>>>>> + struct drm_sched_rq *rq = &sched->sched_rq[i]; >>>>>>>> + >>>>>>>> + if (!rq) >>>>>>>> + continue; >>>>>>>> + >>>>>>>> + /* Loop this way because rq->lock is taken in >>>>>>>> drm_sched_rq_remove_entity */ >>>>>>>> + spin_lock(&rq->lock); >>>>>>>> + while ((s_entity = >>>>>>>> list_first_entry_or_null(&rq->entities, >>>>>>>> + struct drm_sched_entity, >>>>>>>> + list))) { >>>>>>>> + spin_unlock(&rq->lock); >>>>>>>> + >>>>>>>> + /* Prevent reinsertion and remove */ >>>>>>>> + spin_lock(&s_entity->rq_lock); >>>>>>>> + s_entity->stopped = true; >>>>>>>> + drm_sched_rq_remove_entity(rq, s_entity); >>>>>>>> + spin_unlock(&s_entity->rq_lock); >>>>>>> >>>>>>> Well this spin_unlock/lock dance here doesn't look correct at >>>>>>> all now. >>>>>>> >>>>>>> Christian. >>>>>> >>>>>> >>>>>> In what way ? It's in the same same order as in other call sites >>>>>> (see drm_sched_entity_push_job and drm_sched_entity_flush). >>>>>> If i just locked rq->lock and did list_for_each_entry_safe while >>>>>> manually deleting entity->list instead of calling >>>>>> drm_sched_rq_remove_entity this still would not be possible as >>>>>> the order of lock acquisition between s_entity->rq_lock >>>>>> and rq->lock would be reverse compared to the call sites >>>>>> mentioned above. >>>>> >>>>> Ah, now I understand. You need this because >>>>> drm_sched_rq_remove_entity() will grab the rq lock again! >>>>> >>>>> Problem is now what prevents the entity from being destroyed while >>>>> you remove it? >>>>> >>>>> Christian. >>>> >>>> Right, well, since (unfortunately) sched_entity is part of >>>> amdgpu_ctx_entity and amdgpu_ctx_entity is refcounted >>>> there is a problem here that we don't increment amdgpu_ctx.refcount >>>> when assigning sched_entity >>>> to new rq (e.g. before drm_sched_rq_add_entity) and not decrement >>>> before removing. We do it for >>>> amdgpu_cs_parser.entity for example (in amdgpu_cs_parser_init and >>>> amdgpu_cs_parser_fini by >>>> calling amdgpu_ctx_get and amdgpu_ctx_put). But this seems a bit >>>> tricky due to all the drm_sched_entity_select_rq >>>> logic. >>>> >>>> Another, kind of a band aid fix, would probably be just locking >>>> amdgpu_ctx_mgr.lock around drm_sched_fini >>>> when finalizing the fence driver and around idr iteration in >>>> amdgpu_ctx_mgr_fini (which should be lock protected >>>> anyway as I see from other idr usages in the code) ... This should >>>> prevent this use after free. >>> >>> Puh, that's rather complicated as well. Ok let's look at it from the >>> other side for a moment. >>> >>> Why do we have to remove the entities from the rq in the first place? >>> >>> Wouldn't it be sufficient to just set all of them to stopped? >>> >>> Christian. >> >> >> And adding it as another condition in drm_sched_entity_is_idle ? Yes, I think that should work. Christian. >> >> Andrey >> >> >>> >>>> >>>> Andrey >>>> >>>> >>>>> >>>>>> >>>>>> Andrey >>>>>> >>>>>> >>>>>> >>>>>>> >>>>>>>> + >>>>>>>> + spin_lock(&rq->lock); >>>>>>>> + } >>>>>>>> + spin_unlock(&rq->lock); >>>>>>>> + >>>>>>>> + } >>>>>>>> + >>>>>>>> + /* Wakeup everyone stuck in drm_sched_entity_flush for >>>>>>>> this scheduler */ >>>>>>>> + wake_up_all(&sched->job_scheduled); >>>>>>>> + >>>>>>>> /* Confirm no work left behind accessing device >>>>>>>> structures */ >>>>>>>> cancel_delayed_work_sync(&sched->work_tdr); >>>>>>> >>>>> >>>
On 2021-02-25 2:53 a.m., Christian König wrote: > Am 24.02.21 um 16:13 schrieb Andrey Grodzovsky: >> Ping > > Sorry, I've been on vacation this week. > >> >> Andrey >> >> On 2021-02-20 7:12 a.m., Andrey Grodzovsky wrote: >>> >>> On 2/20/21 3:38 AM, Christian König wrote: >>>> >>>> >>>> Am 18.02.21 um 17:41 schrieb Andrey Grodzovsky: >>>>> >>>>> On 2/18/21 10:15 AM, Christian König wrote: >>>>>> Am 18.02.21 um 16:05 schrieb Andrey Grodzovsky: >>>>>>> >>>>>>> On 2/18/21 3:07 AM, Christian König wrote: >>>>>>>> >>>>>>>> >>>>>>>> Am 17.02.21 um 22:59 schrieb Andrey Grodzovsky: >>>>>>>>> Problem: If scheduler is already stopped by the time sched_entity >>>>>>>>> is released and entity's job_queue not empty I encountred >>>>>>>>> a hang in drm_sched_entity_flush. This is because >>>>>>>>> drm_sched_entity_is_idle >>>>>>>>> never becomes false. >>>>>>>>> >>>>>>>>> Fix: In drm_sched_fini detach all sched_entities from the >>>>>>>>> scheduler's run queues. This will satisfy >>>>>>>>> drm_sched_entity_is_idle. >>>>>>>>> Also wakeup all those processes stuck in sched_entity flushing >>>>>>>>> as the scheduler main thread which wakes them up is stopped by >>>>>>>>> now. >>>>>>>>> >>>>>>>>> v2: >>>>>>>>> Reverse order of drm_sched_rq_remove_entity and marking >>>>>>>>> s_entity as stopped to prevent reinserion back to rq due >>>>>>>>> to race. >>>>>>>>> >>>>>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> >>>>>>>>> --- >>>>>>>>> drivers/gpu/drm/scheduler/sched_main.c | 31 >>>>>>>>> +++++++++++++++++++++++++++++++ >>>>>>>>> 1 file changed, 31 insertions(+) >>>>>>>>> >>>>>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c >>>>>>>>> b/drivers/gpu/drm/scheduler/sched_main.c >>>>>>>>> index 908b0b5..c6b7947 100644 >>>>>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c >>>>>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c >>>>>>>>> @@ -897,9 +897,40 @@ EXPORT_SYMBOL(drm_sched_init); >>>>>>>>> */ >>>>>>>>> void drm_sched_fini(struct drm_gpu_scheduler *sched) >>>>>>>>> { >>>>>>>>> + int i; >>>>>>>>> + struct drm_sched_entity *s_entity; >>>>>>>> >>>>>>>> BTW: Please order that so that i is declared last. >>>>>>>> >>>>>>>>> if (sched->thread) >>>>>>>>> kthread_stop(sched->thread); >>>>>>>>> + /* Detach all sched_entites from this scheduler once >>>>>>>>> it's stopped */ >>>>>>>>> + for (i = DRM_SCHED_PRIORITY_COUNT - 1; i >= >>>>>>>>> DRM_SCHED_PRIORITY_MIN; i--) { >>>>>>>>> + struct drm_sched_rq *rq = &sched->sched_rq[i]; >>>>>>>>> + >>>>>>>>> + if (!rq) >>>>>>>>> + continue; >>>>>>>>> + >>>>>>>>> + /* Loop this way because rq->lock is taken in >>>>>>>>> drm_sched_rq_remove_entity */ >>>>>>>>> + spin_lock(&rq->lock); >>>>>>>>> + while ((s_entity = >>>>>>>>> list_first_entry_or_null(&rq->entities, >>>>>>>>> + struct drm_sched_entity, >>>>>>>>> + list))) { >>>>>>>>> + spin_unlock(&rq->lock); >>>>>>>>> + >>>>>>>>> + /* Prevent reinsertion and remove */ >>>>>>>>> + spin_lock(&s_entity->rq_lock); >>>>>>>>> + s_entity->stopped = true; >>>>>>>>> + drm_sched_rq_remove_entity(rq, s_entity); >>>>>>>>> + spin_unlock(&s_entity->rq_lock); >>>>>>>> >>>>>>>> Well this spin_unlock/lock dance here doesn't look correct at >>>>>>>> all now. >>>>>>>> >>>>>>>> Christian. >>>>>>> >>>>>>> >>>>>>> In what way ? It's in the same same order as in other call sites >>>>>>> (see drm_sched_entity_push_job and drm_sched_entity_flush). >>>>>>> If i just locked rq->lock and did list_for_each_entry_safe while >>>>>>> manually deleting entity->list instead of calling >>>>>>> drm_sched_rq_remove_entity this still would not be possible as >>>>>>> the order of lock acquisition between s_entity->rq_lock >>>>>>> and rq->lock would be reverse compared to the call sites >>>>>>> mentioned above. >>>>>> >>>>>> Ah, now I understand. You need this because >>>>>> drm_sched_rq_remove_entity() will grab the rq lock again! >>>>>> >>>>>> Problem is now what prevents the entity from being destroyed >>>>>> while you remove it? >>>>>> >>>>>> Christian. >>>>> >>>>> Right, well, since (unfortunately) sched_entity is part of >>>>> amdgpu_ctx_entity and amdgpu_ctx_entity is refcounted >>>>> there is a problem here that we don't increment >>>>> amdgpu_ctx.refcount when assigning sched_entity >>>>> to new rq (e.g. before drm_sched_rq_add_entity) and not decrement >>>>> before removing. We do it for >>>>> amdgpu_cs_parser.entity for example (in amdgpu_cs_parser_init and >>>>> amdgpu_cs_parser_fini by >>>>> calling amdgpu_ctx_get and amdgpu_ctx_put). But this seems a bit >>>>> tricky due to all the drm_sched_entity_select_rq >>>>> logic. >>>>> >>>>> Another, kind of a band aid fix, would probably be just locking >>>>> amdgpu_ctx_mgr.lock around drm_sched_fini >>>>> when finalizing the fence driver and around idr iteration in >>>>> amdgpu_ctx_mgr_fini (which should be lock protected >>>>> anyway as I see from other idr usages in the code) ... This should >>>>> prevent this use after free. >>>> >>>> Puh, that's rather complicated as well. Ok let's look at it from >>>> the other side for a moment. >>>> >>>> Why do we have to remove the entities from the rq in the first place? >>>> >>>> Wouldn't it be sufficient to just set all of them to stopped? >>>> >>>> Christian. >>> >>> >>> And adding it as another condition in drm_sched_entity_is_idle ? > > Yes, I think that should work. In this case reverse locking order is created, In drm_sched_entity_push_job and drm_sched_entity_flush lock entity->rq_lock locked first and rq->lock locked second. In drm_sched_fini on the other hand, I need to lock rq->lock first to iterate rq->entities and then lock s_entity->rq_lock for each entity to modify s_entity->stopped. I guess we could change s_entity->stopped to atomic ? Andrey > > Christian. > > >>> >>> Andrey >>> >>> >>>> >>>>> >>>>> Andrey >>>>> >>>>> >>>>>> >>>>>>> >>>>>>> Andrey >>>>>>> >>>>>>> >>>>>>> >>>>>>>> >>>>>>>>> + >>>>>>>>> + spin_lock(&rq->lock); >>>>>>>>> + } >>>>>>>>> + spin_unlock(&rq->lock); >>>>>>>>> + >>>>>>>>> + } >>>>>>>>> + >>>>>>>>> + /* Wakeup everyone stuck in drm_sched_entity_flush for >>>>>>>>> this scheduler */ >>>>>>>>> + wake_up_all(&sched->job_scheduled); >>>>>>>>> + >>>>>>>>> /* Confirm no work left behind accessing device >>>>>>>>> structures */ >>>>>>>>> cancel_delayed_work_sync(&sched->work_tdr); >>>>>>>> >>>>>> >>>> >
Am 25.02.21 um 17:03 schrieb Andrey Grodzovsky: > > On 2021-02-25 2:53 a.m., Christian König wrote: >> Am 24.02.21 um 16:13 schrieb Andrey Grodzovsky: >>> Ping >> >> Sorry, I've been on vacation this week. >> >>> >>> Andrey >>> >>> On 2021-02-20 7:12 a.m., Andrey Grodzovsky wrote: >>>> >>>> On 2/20/21 3:38 AM, Christian König wrote: >>>>> >>>>> >>>>> Am 18.02.21 um 17:41 schrieb Andrey Grodzovsky: >>>>>> >>>>>> On 2/18/21 10:15 AM, Christian König wrote: >>>>>>> Am 18.02.21 um 16:05 schrieb Andrey Grodzovsky: >>>>>>>> >>>>>>>> On 2/18/21 3:07 AM, Christian König wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> Am 17.02.21 um 22:59 schrieb Andrey Grodzovsky: >>>>>>>>>> Problem: If scheduler is already stopped by the time >>>>>>>>>> sched_entity >>>>>>>>>> is released and entity's job_queue not empty I encountred >>>>>>>>>> a hang in drm_sched_entity_flush. This is because >>>>>>>>>> drm_sched_entity_is_idle >>>>>>>>>> never becomes false. >>>>>>>>>> >>>>>>>>>> Fix: In drm_sched_fini detach all sched_entities from the >>>>>>>>>> scheduler's run queues. This will satisfy >>>>>>>>>> drm_sched_entity_is_idle. >>>>>>>>>> Also wakeup all those processes stuck in sched_entity flushing >>>>>>>>>> as the scheduler main thread which wakes them up is stopped >>>>>>>>>> by now. >>>>>>>>>> >>>>>>>>>> v2: >>>>>>>>>> Reverse order of drm_sched_rq_remove_entity and marking >>>>>>>>>> s_entity as stopped to prevent reinserion back to rq due >>>>>>>>>> to race. >>>>>>>>>> >>>>>>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> >>>>>>>>>> --- >>>>>>>>>> drivers/gpu/drm/scheduler/sched_main.c | 31 >>>>>>>>>> +++++++++++++++++++++++++++++++ >>>>>>>>>> 1 file changed, 31 insertions(+) >>>>>>>>>> >>>>>>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c >>>>>>>>>> b/drivers/gpu/drm/scheduler/sched_main.c >>>>>>>>>> index 908b0b5..c6b7947 100644 >>>>>>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c >>>>>>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c >>>>>>>>>> @@ -897,9 +897,40 @@ EXPORT_SYMBOL(drm_sched_init); >>>>>>>>>> */ >>>>>>>>>> void drm_sched_fini(struct drm_gpu_scheduler *sched) >>>>>>>>>> { >>>>>>>>>> + int i; >>>>>>>>>> + struct drm_sched_entity *s_entity; >>>>>>>>> >>>>>>>>> BTW: Please order that so that i is declared last. >>>>>>>>> >>>>>>>>>> if (sched->thread) >>>>>>>>>> kthread_stop(sched->thread); >>>>>>>>>> + /* Detach all sched_entites from this scheduler once >>>>>>>>>> it's stopped */ >>>>>>>>>> + for (i = DRM_SCHED_PRIORITY_COUNT - 1; i >= >>>>>>>>>> DRM_SCHED_PRIORITY_MIN; i--) { >>>>>>>>>> + struct drm_sched_rq *rq = &sched->sched_rq[i]; >>>>>>>>>> + >>>>>>>>>> + if (!rq) >>>>>>>>>> + continue; >>>>>>>>>> + >>>>>>>>>> + /* Loop this way because rq->lock is taken in >>>>>>>>>> drm_sched_rq_remove_entity */ >>>>>>>>>> + spin_lock(&rq->lock); >>>>>>>>>> + while ((s_entity = >>>>>>>>>> list_first_entry_or_null(&rq->entities, >>>>>>>>>> + struct drm_sched_entity, >>>>>>>>>> + list))) { >>>>>>>>>> + spin_unlock(&rq->lock); >>>>>>>>>> + >>>>>>>>>> + /* Prevent reinsertion and remove */ >>>>>>>>>> + spin_lock(&s_entity->rq_lock); >>>>>>>>>> + s_entity->stopped = true; >>>>>>>>>> + drm_sched_rq_remove_entity(rq, s_entity); >>>>>>>>>> + spin_unlock(&s_entity->rq_lock); >>>>>>>>> >>>>>>>>> Well this spin_unlock/lock dance here doesn't look correct at >>>>>>>>> all now. >>>>>>>>> >>>>>>>>> Christian. >>>>>>>> >>>>>>>> >>>>>>>> In what way ? It's in the same same order as in other call >>>>>>>> sites (see drm_sched_entity_push_job and drm_sched_entity_flush). >>>>>>>> If i just locked rq->lock and did list_for_each_entry_safe >>>>>>>> while manually deleting entity->list instead of calling >>>>>>>> drm_sched_rq_remove_entity this still would not be possible as >>>>>>>> the order of lock acquisition between s_entity->rq_lock >>>>>>>> and rq->lock would be reverse compared to the call sites >>>>>>>> mentioned above. >>>>>>> >>>>>>> Ah, now I understand. You need this because >>>>>>> drm_sched_rq_remove_entity() will grab the rq lock again! >>>>>>> >>>>>>> Problem is now what prevents the entity from being destroyed >>>>>>> while you remove it? >>>>>>> >>>>>>> Christian. >>>>>> >>>>>> Right, well, since (unfortunately) sched_entity is part of >>>>>> amdgpu_ctx_entity and amdgpu_ctx_entity is refcounted >>>>>> there is a problem here that we don't increment >>>>>> amdgpu_ctx.refcount when assigning sched_entity >>>>>> to new rq (e.g. before drm_sched_rq_add_entity) and not decrement >>>>>> before removing. We do it for >>>>>> amdgpu_cs_parser.entity for example (in amdgpu_cs_parser_init and >>>>>> amdgpu_cs_parser_fini by >>>>>> calling amdgpu_ctx_get and amdgpu_ctx_put). But this seems a bit >>>>>> tricky due to all the drm_sched_entity_select_rq >>>>>> logic. >>>>>> >>>>>> Another, kind of a band aid fix, would probably be just locking >>>>>> amdgpu_ctx_mgr.lock around drm_sched_fini >>>>>> when finalizing the fence driver and around idr iteration in >>>>>> amdgpu_ctx_mgr_fini (which should be lock protected >>>>>> anyway as I see from other idr usages in the code) ... This >>>>>> should prevent this use after free. >>>>> >>>>> Puh, that's rather complicated as well. Ok let's look at it from >>>>> the other side for a moment. >>>>> >>>>> Why do we have to remove the entities from the rq in the first place? >>>>> >>>>> Wouldn't it be sufficient to just set all of them to stopped? >>>>> >>>>> Christian. >>>> >>>> >>>> And adding it as another condition in drm_sched_entity_is_idle ? >> >> Yes, I think that should work. > > > In this case reverse locking order is created, In > drm_sched_entity_push_job and drm_sched_entity_flush lock > entity->rq_lock locked first and rq->lock locked second. In > drm_sched_fini on the other hand, I need to lock rq->lock first to > iterate rq->entities and then lock s_entity->rq_lock for each entity > to modify s_entity->stopped. I guess we could change s_entity->stopped > to atomic ? Good point. But I think the memory barrier inserted by wake_up_all(&sched->job_scheduled); should be sufficient. If I see this correctly we have the entity->rq_lock mainly to protect concurrent changes of entity->rq. But when two CPUs both set entity->stopped to true at the same time we don't really care about it as long drm_sched_entity_is_idle() sees it. Regards, Christian. > > Andrey > > >> >> Christian. >> >> >>>> >>>> Andrey >>>> >>>> >>>>> >>>>>> >>>>>> Andrey >>>>>> >>>>>> >>>>>>> >>>>>>>> >>>>>>>> Andrey >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>>>> + >>>>>>>>>> + spin_lock(&rq->lock); >>>>>>>>>> + } >>>>>>>>>> + spin_unlock(&rq->lock); >>>>>>>>>> + >>>>>>>>>> + } >>>>>>>>>> + >>>>>>>>>> + /* Wakeup everyone stuck in drm_sched_entity_flush for >>>>>>>>>> this scheduler */ >>>>>>>>>> + wake_up_all(&sched->job_scheduled); >>>>>>>>>> + >>>>>>>>>> /* Confirm no work left behind accessing device >>>>>>>>>> structures */ >>>>>>>>>> cancel_delayed_work_sync(&sched->work_tdr); >>>>>>>>> >>>>>>> >>>>> >>
On 2021-02-25 1:42 p.m., Christian König wrote: > > > Am 25.02.21 um 17:03 schrieb Andrey Grodzovsky: >> >> On 2021-02-25 2:53 a.m., Christian König wrote: >>> Am 24.02.21 um 16:13 schrieb Andrey Grodzovsky: >>>> Ping >>> >>> Sorry, I've been on vacation this week. >>> >>>> >>>> Andrey >>>> >>>> On 2021-02-20 7:12 a.m., Andrey Grodzovsky wrote: >>>>> >>>>> On 2/20/21 3:38 AM, Christian König wrote: >>>>>> >>>>>> >>>>>> Am 18.02.21 um 17:41 schrieb Andrey Grodzovsky: >>>>>>> >>>>>>> On 2/18/21 10:15 AM, Christian König wrote: >>>>>>>> Am 18.02.21 um 16:05 schrieb Andrey Grodzovsky: >>>>>>>>> >>>>>>>>> On 2/18/21 3:07 AM, Christian König wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Am 17.02.21 um 22:59 schrieb Andrey Grodzovsky: >>>>>>>>>>> Problem: If scheduler is already stopped by the time >>>>>>>>>>> sched_entity >>>>>>>>>>> is released and entity's job_queue not empty I encountred >>>>>>>>>>> a hang in drm_sched_entity_flush. This is because >>>>>>>>>>> drm_sched_entity_is_idle >>>>>>>>>>> never becomes false. >>>>>>>>>>> >>>>>>>>>>> Fix: In drm_sched_fini detach all sched_entities from the >>>>>>>>>>> scheduler's run queues. This will satisfy >>>>>>>>>>> drm_sched_entity_is_idle. >>>>>>>>>>> Also wakeup all those processes stuck in sched_entity flushing >>>>>>>>>>> as the scheduler main thread which wakes them up is stopped >>>>>>>>>>> by now. >>>>>>>>>>> >>>>>>>>>>> v2: >>>>>>>>>>> Reverse order of drm_sched_rq_remove_entity and marking >>>>>>>>>>> s_entity as stopped to prevent reinserion back to rq due >>>>>>>>>>> to race. >>>>>>>>>>> >>>>>>>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> >>>>>>>>>>> --- >>>>>>>>>>> drivers/gpu/drm/scheduler/sched_main.c | 31 >>>>>>>>>>> +++++++++++++++++++++++++++++++ >>>>>>>>>>> 1 file changed, 31 insertions(+) >>>>>>>>>>> >>>>>>>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c >>>>>>>>>>> b/drivers/gpu/drm/scheduler/sched_main.c >>>>>>>>>>> index 908b0b5..c6b7947 100644 >>>>>>>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c >>>>>>>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c >>>>>>>>>>> @@ -897,9 +897,40 @@ EXPORT_SYMBOL(drm_sched_init); >>>>>>>>>>> */ >>>>>>>>>>> void drm_sched_fini(struct drm_gpu_scheduler *sched) >>>>>>>>>>> { >>>>>>>>>>> + int i; >>>>>>>>>>> + struct drm_sched_entity *s_entity; >>>>>>>>>> >>>>>>>>>> BTW: Please order that so that i is declared last. >>>>>>>>>> >>>>>>>>>>> if (sched->thread) >>>>>>>>>>> kthread_stop(sched->thread); >>>>>>>>>>> + /* Detach all sched_entites from this scheduler once >>>>>>>>>>> it's stopped */ >>>>>>>>>>> + for (i = DRM_SCHED_PRIORITY_COUNT - 1; i >= >>>>>>>>>>> DRM_SCHED_PRIORITY_MIN; i--) { >>>>>>>>>>> + struct drm_sched_rq *rq = &sched->sched_rq[i]; >>>>>>>>>>> + >>>>>>>>>>> + if (!rq) >>>>>>>>>>> + continue; >>>>>>>>>>> + >>>>>>>>>>> + /* Loop this way because rq->lock is taken in >>>>>>>>>>> drm_sched_rq_remove_entity */ >>>>>>>>>>> + spin_lock(&rq->lock); >>>>>>>>>>> + while ((s_entity = >>>>>>>>>>> list_first_entry_or_null(&rq->entities, >>>>>>>>>>> + struct drm_sched_entity, >>>>>>>>>>> + list))) { >>>>>>>>>>> + spin_unlock(&rq->lock); >>>>>>>>>>> + >>>>>>>>>>> + /* Prevent reinsertion and remove */ >>>>>>>>>>> + spin_lock(&s_entity->rq_lock); >>>>>>>>>>> + s_entity->stopped = true; >>>>>>>>>>> + drm_sched_rq_remove_entity(rq, s_entity); >>>>>>>>>>> + spin_unlock(&s_entity->rq_lock); >>>>>>>>>> >>>>>>>>>> Well this spin_unlock/lock dance here doesn't look correct at >>>>>>>>>> all now. >>>>>>>>>> >>>>>>>>>> Christian. >>>>>>>>> >>>>>>>>> >>>>>>>>> In what way ? It's in the same same order as in other call >>>>>>>>> sites (see drm_sched_entity_push_job and drm_sched_entity_flush). >>>>>>>>> If i just locked rq->lock and did list_for_each_entry_safe >>>>>>>>> while manually deleting entity->list instead of calling >>>>>>>>> drm_sched_rq_remove_entity this still would not be possible as >>>>>>>>> the order of lock acquisition between s_entity->rq_lock >>>>>>>>> and rq->lock would be reverse compared to the call sites >>>>>>>>> mentioned above. >>>>>>>> >>>>>>>> Ah, now I understand. You need this because >>>>>>>> drm_sched_rq_remove_entity() will grab the rq lock again! >>>>>>>> >>>>>>>> Problem is now what prevents the entity from being destroyed >>>>>>>> while you remove it? >>>>>>>> >>>>>>>> Christian. >>>>>>> >>>>>>> Right, well, since (unfortunately) sched_entity is part of >>>>>>> amdgpu_ctx_entity and amdgpu_ctx_entity is refcounted >>>>>>> there is a problem here that we don't increment >>>>>>> amdgpu_ctx.refcount when assigning sched_entity >>>>>>> to new rq (e.g. before drm_sched_rq_add_entity) and not >>>>>>> decrement before removing. We do it for >>>>>>> amdgpu_cs_parser.entity for example (in amdgpu_cs_parser_init >>>>>>> and amdgpu_cs_parser_fini by >>>>>>> calling amdgpu_ctx_get and amdgpu_ctx_put). But this seems a bit >>>>>>> tricky due to all the drm_sched_entity_select_rq >>>>>>> logic. >>>>>>> >>>>>>> Another, kind of a band aid fix, would probably be just locking >>>>>>> amdgpu_ctx_mgr.lock around drm_sched_fini >>>>>>> when finalizing the fence driver and around idr iteration in >>>>>>> amdgpu_ctx_mgr_fini (which should be lock protected >>>>>>> anyway as I see from other idr usages in the code) ... This >>>>>>> should prevent this use after free. >>>>>> >>>>>> Puh, that's rather complicated as well. Ok let's look at it from >>>>>> the other side for a moment. >>>>>> >>>>>> Why do we have to remove the entities from the rq in the first >>>>>> place? >>>>>> >>>>>> Wouldn't it be sufficient to just set all of them to stopped? >>>>>> >>>>>> Christian. >>>>> >>>>> >>>>> And adding it as another condition in drm_sched_entity_is_idle ? >>> >>> Yes, I think that should work. >> >> >> In this case reverse locking order is created, In >> drm_sched_entity_push_job and drm_sched_entity_flush lock >> entity->rq_lock locked first and rq->lock locked second. In >> drm_sched_fini on the other hand, I need to lock rq->lock first to >> iterate rq->entities and then lock s_entity->rq_lock for each entity >> to modify s_entity->stopped. I guess we could change >> s_entity->stopped to atomic ? > > Good point. But I think the memory barrier inserted by > wake_up_all(&sched->job_scheduled); should be sufficient. > > If I see this correctly we have the entity->rq_lock mainly to protect > concurrent changes of entity->rq. > > But when two CPUs both set entity->stopped to true at the same time we > don't really care about it as long drm_sched_entity_is_idle() sees it. > > Regards, > Christian. I was more thinking about integrity of reading/writing entity->stopped from different threads. I guess since it's bool it's guaranteed to be atomic from HW perspective anyway ? Will send V3 soon. Andrey > >> >> Andrey >> >> >>> >>> Christian. >>> >>> >>>>> >>>>> Andrey >>>>> >>>>> >>>>>> >>>>>>> >>>>>>> Andrey >>>>>>> >>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>>> Andrey >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>>> >>>>>>>>>>> + >>>>>>>>>>> + spin_lock(&rq->lock); >>>>>>>>>>> + } >>>>>>>>>>> + spin_unlock(&rq->lock); >>>>>>>>>>> + >>>>>>>>>>> + } >>>>>>>>>>> + >>>>>>>>>>> + /* Wakeup everyone stuck in drm_sched_entity_flush for >>>>>>>>>>> this scheduler */ >>>>>>>>>>> + wake_up_all(&sched->job_scheduled); >>>>>>>>>>> + >>>>>>>>>>> /* Confirm no work left behind accessing device >>>>>>>>>>> structures */ >>>>>>>>>>> cancel_delayed_work_sync(&sched->work_tdr); >>>>>>>>>> >>>>>>>> >>>>>> >>> >
Am 25.02.21 um 22:27 schrieb Andrey Grodzovsky: > > On 2021-02-25 1:42 p.m., Christian König wrote: >> >> >> Am 25.02.21 um 17:03 schrieb Andrey Grodzovsky: >>> >>> On 2021-02-25 2:53 a.m., Christian König wrote: >>>> Am 24.02.21 um 16:13 schrieb Andrey Grodzovsky: >>>>> Ping >>>> >>>> Sorry, I've been on vacation this week. >>>> >>>>> >>>>> Andrey >>>>> >>>>> On 2021-02-20 7:12 a.m., Andrey Grodzovsky wrote: >>>>>> >>>>>> On 2/20/21 3:38 AM, Christian König wrote: >>>>>>> >>>>>>> >>>>>>> Am 18.02.21 um 17:41 schrieb Andrey Grodzovsky: >>>>>>>> >>>>>>>> On 2/18/21 10:15 AM, Christian König wrote: >>>>>>>>> Am 18.02.21 um 16:05 schrieb Andrey Grodzovsky: >>>>>>>>>> >>>>>>>>>> On 2/18/21 3:07 AM, Christian König wrote: >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Am 17.02.21 um 22:59 schrieb Andrey Grodzovsky: >>>>>>>>>>>> Problem: If scheduler is already stopped by the time >>>>>>>>>>>> sched_entity >>>>>>>>>>>> is released and entity's job_queue not empty I encountred >>>>>>>>>>>> a hang in drm_sched_entity_flush. This is because >>>>>>>>>>>> drm_sched_entity_is_idle >>>>>>>>>>>> never becomes false. >>>>>>>>>>>> >>>>>>>>>>>> Fix: In drm_sched_fini detach all sched_entities from the >>>>>>>>>>>> scheduler's run queues. This will satisfy >>>>>>>>>>>> drm_sched_entity_is_idle. >>>>>>>>>>>> Also wakeup all those processes stuck in sched_entity flushing >>>>>>>>>>>> as the scheduler main thread which wakes them up is stopped >>>>>>>>>>>> by now. >>>>>>>>>>>> >>>>>>>>>>>> v2: >>>>>>>>>>>> Reverse order of drm_sched_rq_remove_entity and marking >>>>>>>>>>>> s_entity as stopped to prevent reinserion back to rq due >>>>>>>>>>>> to race. >>>>>>>>>>>> >>>>>>>>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> >>>>>>>>>>>> --- >>>>>>>>>>>> drivers/gpu/drm/scheduler/sched_main.c | 31 >>>>>>>>>>>> +++++++++++++++++++++++++++++++ >>>>>>>>>>>> 1 file changed, 31 insertions(+) >>>>>>>>>>>> >>>>>>>>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c >>>>>>>>>>>> b/drivers/gpu/drm/scheduler/sched_main.c >>>>>>>>>>>> index 908b0b5..c6b7947 100644 >>>>>>>>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c >>>>>>>>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c >>>>>>>>>>>> @@ -897,9 +897,40 @@ EXPORT_SYMBOL(drm_sched_init); >>>>>>>>>>>> */ >>>>>>>>>>>> void drm_sched_fini(struct drm_gpu_scheduler *sched) >>>>>>>>>>>> { >>>>>>>>>>>> + int i; >>>>>>>>>>>> + struct drm_sched_entity *s_entity; >>>>>>>>>>> >>>>>>>>>>> BTW: Please order that so that i is declared last. >>>>>>>>>>> >>>>>>>>>>>> if (sched->thread) >>>>>>>>>>>> kthread_stop(sched->thread); >>>>>>>>>>>> + /* Detach all sched_entites from this scheduler once >>>>>>>>>>>> it's stopped */ >>>>>>>>>>>> + for (i = DRM_SCHED_PRIORITY_COUNT - 1; i >= >>>>>>>>>>>> DRM_SCHED_PRIORITY_MIN; i--) { >>>>>>>>>>>> + struct drm_sched_rq *rq = &sched->sched_rq[i]; >>>>>>>>>>>> + >>>>>>>>>>>> + if (!rq) >>>>>>>>>>>> + continue; >>>>>>>>>>>> + >>>>>>>>>>>> + /* Loop this way because rq->lock is taken in >>>>>>>>>>>> drm_sched_rq_remove_entity */ >>>>>>>>>>>> + spin_lock(&rq->lock); >>>>>>>>>>>> + while ((s_entity = >>>>>>>>>>>> list_first_entry_or_null(&rq->entities, >>>>>>>>>>>> + struct drm_sched_entity, >>>>>>>>>>>> + list))) { >>>>>>>>>>>> + spin_unlock(&rq->lock); >>>>>>>>>>>> + >>>>>>>>>>>> + /* Prevent reinsertion and remove */ >>>>>>>>>>>> + spin_lock(&s_entity->rq_lock); >>>>>>>>>>>> + s_entity->stopped = true; >>>>>>>>>>>> + drm_sched_rq_remove_entity(rq, s_entity); >>>>>>>>>>>> + spin_unlock(&s_entity->rq_lock); >>>>>>>>>>> >>>>>>>>>>> Well this spin_unlock/lock dance here doesn't look correct >>>>>>>>>>> at all now. >>>>>>>>>>> >>>>>>>>>>> Christian. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> In what way ? It's in the same same order as in other call >>>>>>>>>> sites (see drm_sched_entity_push_job and >>>>>>>>>> drm_sched_entity_flush). >>>>>>>>>> If i just locked rq->lock and did list_for_each_entry_safe >>>>>>>>>> while manually deleting entity->list instead of calling >>>>>>>>>> drm_sched_rq_remove_entity this still would not be possible >>>>>>>>>> as the order of lock acquisition between s_entity->rq_lock >>>>>>>>>> and rq->lock would be reverse compared to the call sites >>>>>>>>>> mentioned above. >>>>>>>>> >>>>>>>>> Ah, now I understand. You need this because >>>>>>>>> drm_sched_rq_remove_entity() will grab the rq lock again! >>>>>>>>> >>>>>>>>> Problem is now what prevents the entity from being destroyed >>>>>>>>> while you remove it? >>>>>>>>> >>>>>>>>> Christian. >>>>>>>> >>>>>>>> Right, well, since (unfortunately) sched_entity is part of >>>>>>>> amdgpu_ctx_entity and amdgpu_ctx_entity is refcounted >>>>>>>> there is a problem here that we don't increment >>>>>>>> amdgpu_ctx.refcount when assigning sched_entity >>>>>>>> to new rq (e.g. before drm_sched_rq_add_entity) and not >>>>>>>> decrement before removing. We do it for >>>>>>>> amdgpu_cs_parser.entity for example (in amdgpu_cs_parser_init >>>>>>>> and amdgpu_cs_parser_fini by >>>>>>>> calling amdgpu_ctx_get and amdgpu_ctx_put). But this seems a >>>>>>>> bit tricky due to all the drm_sched_entity_select_rq >>>>>>>> logic. >>>>>>>> >>>>>>>> Another, kind of a band aid fix, would probably be just locking >>>>>>>> amdgpu_ctx_mgr.lock around drm_sched_fini >>>>>>>> when finalizing the fence driver and around idr iteration in >>>>>>>> amdgpu_ctx_mgr_fini (which should be lock protected >>>>>>>> anyway as I see from other idr usages in the code) ... This >>>>>>>> should prevent this use after free. >>>>>>> >>>>>>> Puh, that's rather complicated as well. Ok let's look at it from >>>>>>> the other side for a moment. >>>>>>> >>>>>>> Why do we have to remove the entities from the rq in the first >>>>>>> place? >>>>>>> >>>>>>> Wouldn't it be sufficient to just set all of them to stopped? >>>>>>> >>>>>>> Christian. >>>>>> >>>>>> >>>>>> And adding it as another condition in drm_sched_entity_is_idle ? >>>> >>>> Yes, I think that should work. >>> >>> >>> In this case reverse locking order is created, In >>> drm_sched_entity_push_job and drm_sched_entity_flush lock >>> entity->rq_lock locked first and rq->lock locked second. In >>> drm_sched_fini on the other hand, I need to lock rq->lock first to >>> iterate rq->entities and then lock s_entity->rq_lock for each entity >>> to modify s_entity->stopped. I guess we could change >>> s_entity->stopped to atomic ? >> >> Good point. But I think the memory barrier inserted by >> wake_up_all(&sched->job_scheduled); should be sufficient. >> >> If I see this correctly we have the entity->rq_lock mainly to protect >> concurrent changes of entity->rq. >> >> But when two CPUs both set entity->stopped to true at the same time >> we don't really care about it as long drm_sched_entity_is_idle() sees >> it. >> >> Regards, >> Christian. > > > I was more thinking about integrity of reading/writing entity->stopped > from different threads. I guess since it's bool it's guaranteed to be > atomic from HW perspective anyway ? More or less yes. The key point is that we only change it from false -> true and never the other way around. All that you need then for other CPUs to see the value is a barrier. Christian. > Will send V3 soon. > > Andrey > > >> >>> >>> Andrey >>> >>> >>>> >>>> Christian. >>>> >>>> >>>>>> >>>>>> Andrey >>>>>> >>>>>> >>>>>>> >>>>>>>> >>>>>>>> Andrey >>>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>>>> >>>>>>>>>> Andrey >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> + >>>>>>>>>>>> + spin_lock(&rq->lock); >>>>>>>>>>>> + } >>>>>>>>>>>> + spin_unlock(&rq->lock); >>>>>>>>>>>> + >>>>>>>>>>>> + } >>>>>>>>>>>> + >>>>>>>>>>>> + /* Wakeup everyone stuck in drm_sched_entity_flush for >>>>>>>>>>>> this scheduler */ >>>>>>>>>>>> + wake_up_all(&sched->job_scheduled); >>>>>>>>>>>> + >>>>>>>>>>>> /* Confirm no work left behind accessing device >>>>>>>>>>>> structures */ >>>>>>>>>>>> cancel_delayed_work_sync(&sched->work_tdr); >>>>>>>>>>> >>>>>>>>> >>>>>>> >>>> >>
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index 908b0b5..c6b7947 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -897,9 +897,40 @@ EXPORT_SYMBOL(drm_sched_init); */ void drm_sched_fini(struct drm_gpu_scheduler *sched) { + int i; + struct drm_sched_entity *s_entity; if (sched->thread) kthread_stop(sched->thread); + /* Detach all sched_entites from this scheduler once it's stopped */ + for (i = DRM_SCHED_PRIORITY_COUNT - 1; i >= DRM_SCHED_PRIORITY_MIN; i--) { + struct drm_sched_rq *rq = &sched->sched_rq[i]; + + if (!rq) + continue; + + /* Loop this way because rq->lock is taken in drm_sched_rq_remove_entity */ + spin_lock(&rq->lock); + while ((s_entity = list_first_entry_or_null(&rq->entities, + struct drm_sched_entity, + list))) { + spin_unlock(&rq->lock); + + /* Prevent reinsertion and remove */ + spin_lock(&s_entity->rq_lock); + s_entity->stopped = true; + drm_sched_rq_remove_entity(rq, s_entity); + spin_unlock(&s_entity->rq_lock); + + spin_lock(&rq->lock); + } + spin_unlock(&rq->lock); + + } + + /* Wakeup everyone stuck in drm_sched_entity_flush for this scheduler */ + wake_up_all(&sched->job_scheduled); + /* Confirm no work left behind accessing device structures */ cancel_delayed_work_sync(&sched->work_tdr);
Problem: If scheduler is already stopped by the time sched_entity is released and entity's job_queue not empty I encountred a hang in drm_sched_entity_flush. This is because drm_sched_entity_is_idle never becomes false. Fix: In drm_sched_fini detach all sched_entities from the scheduler's run queues. This will satisfy drm_sched_entity_is_idle. Also wakeup all those processes stuck in sched_entity flushing as the scheduler main thread which wakes them up is stopped by now. v2: Reverse order of drm_sched_rq_remove_entity and marking s_entity as stopped to prevent reinserion back to rq due to race. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> --- drivers/gpu/drm/scheduler/sched_main.c | 31 +++++++++++++++++++++++++++++++ 1 file changed, 31 insertions(+)