
[v3,01/20] drm/sched: entity->rq selection cannot fail

Message ID 20210708173754.3877540-2-daniel.vetter@ffwll.ch (mailing list archive)
State New, archived
Series drm/sched dependency tracking and dma-resv fixes

Commit Message

Daniel Vetter July 8, 2021, 5:37 p.m. UTC
If it does, someone managed to set up a sched_entity without
schedulers, which is just a driver bug.

We BUG_ON() here because in the next patch drm_sched_job_init() will
be split up, with drm_sched_job_arm() never failing. And that's the
part where the rq selection will end up in.

Note that if having an empty sched_list set on an entity is indeed a
valid use-case, we can keep that check in job_init even after the split
into job_init/arm.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Steven Price <steven.price@arm.com>
Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Cc: Boris Brezillon <boris.brezillon@collabora.com>
Cc: Jack Zhang <Jack.Zhang1@amd.com>
---
 drivers/gpu/drm/scheduler/sched_entity.c | 2 +-
 drivers/gpu/drm/scheduler/sched_main.c   | 3 +--
 2 files changed, 2 insertions(+), 3 deletions(-)
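
To make the intent concrete, here is a rough sketch of the direction: the
setup that may legitimately fail stays in drm_sched_job_init(), and the rq
selection moves into a later drm_sched_job_arm() that cannot fail. The
bodies below are illustrative only (drm_sched_fence_alloc() in particular
is an assumed helper name), not the actual patches in this series.

	int drm_sched_job_init(struct drm_sched_job *job,
			       struct drm_sched_entity *entity,
			       void *owner)
	{
		/* Only setup that may fail (allocations etc.) stays here. */
		job->entity = entity;
		job->s_fence = drm_sched_fence_alloc(entity, owner); /* assumed helper */
		if (!job->s_fence)
			return -ENOMEM;

		return 0;
	}

	void drm_sched_job_arm(struct drm_sched_job *job)
	{
		struct drm_sched_entity *entity = job->entity;

		/* rq selection moves here; with a non-empty sched_list it cannot fail. */
		drm_sched_entity_select_rq(entity);
		job->sched = entity->rq->sched;
	}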

Comments

Christian König July 9, 2021, 6:53 a.m. UTC | #1
Am 08.07.21 um 19:37 schrieb Daniel Vetter:
> If it does, someone managed to set up a sched_entity without
> schedulers, which is just a driver bug.

NAK, it is perfectly valid for rq selection to fail.

See drm_sched_pick_best():

                 if (!sched->ready) {
                         DRM_WARN("scheduler %s is not ready, skipping",
                                  sched->name);
                         continue;
                 }

This can happen when a device reset fails for some engine.
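
In other words, the path that can leave entity->rq NULL looks roughly like
this (heavily simplified from drm_sched_entity_select_rq(), locking and the
early-out checks omitted; not verbatim kernel code):

	void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
	{
		struct drm_gpu_scheduler *sched;

		sched = drm_sched_pick_best(entity->sched_list,
					    entity->num_sched_list);
		if (!sched) {
			/* every scheduler in the list was !ready */
			entity->rq = NULL;
			return;
		}

		entity->rq = &sched->sched_rq[entity->priority];
	}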

Regards,
Christian.

>
> We BUG_ON() here because in the next patch drm_sched_job_init() will
> be split up, with drm_sched_job_arm() never failing. And that's the
> part where the rq selection will end up in.
>
> Note that if having an empty sched_list set on an entity is indeed a
> valid use-case, we can keep that check in job_init even after the split
> into job_init/arm.
>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: Luben Tuikov <luben.tuikov@amd.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Steven Price <steven.price@arm.com>
> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> Cc: Boris Brezillon <boris.brezillon@collabora.com>
> Cc: Jack Zhang <Jack.Zhang1@amd.com>
> ---
>   drivers/gpu/drm/scheduler/sched_entity.c | 2 +-
>   drivers/gpu/drm/scheduler/sched_main.c   | 3 +--
>   2 files changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> index 79554aa4dbb1..6fc116ee7302 100644
> --- a/drivers/gpu/drm/scheduler/sched_entity.c
> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> @@ -45,7 +45,7 @@
>    * @guilty: atomic_t set to 1 when a job on this queue
>    *          is found to be guilty causing a timeout
>    *
> - * Note: the sched_list should have at least one element to schedule
> + * Note: the sched_list must have at least one element to schedule
>    *       the entity
>    *
>    * Returns 0 on success or a negative error code on failure.
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 33c414d55fab..01dd47154181 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -586,8 +586,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
>   	struct drm_gpu_scheduler *sched;
>   
>   	drm_sched_entity_select_rq(entity);
> -	if (!entity->rq)
> -		return -ENOENT;
> +	BUG_ON(!entity->rq);
>   
>   	sched = entity->rq->sched;
>
Daniel Vetter July 9, 2021, 7:14 a.m. UTC | #2
On Fri, Jul 9, 2021 at 8:53 AM Christian König <christian.koenig@amd.com> wrote:
> Am 08.07.21 um 19:37 schrieb Daniel Vetter:
> > If it does, someone managed to set up a sched_entity without
> > schedulers, which is just a driver bug.
>
> NAK, it is perfectly valid for rq selection to fail.

There isn't a better way to explain stuff to someone who's new to the
code and tries to improve it with docs than to NAK stuff with
incomplete explanations?

> See drm_sched_pick_best():
>
>                  if (!sched->ready) {
>                          DRM_WARN("scheduler %s is not ready, skipping",
>                                   sched->name);
>                          continue;
>                  }
>
> This can happen when a device reset fails for some engine.

Well yeah I didn't expect amdgpu to just change this directly, so I
didn't find it. Getting an ENOENT on a hw failure instead of an EIO is
a bit interesting semantics I guess, also what happens with the jobs
which raced against the scheduler not being ready? I'm not seeing any
checks for ready in the main scheduler logic so this at least looks
somewhat accidental as a side effect, also no other driver than amdgpu
communicates that reset failed back to drm/sched like this. They seem
to just not, and I guess timeout on the next request will get us into
an endless reset loop?
-Daniel


>
> Regards,
> Christian.
>
> >
> > We BUG_ON() here because in the next patch drm_sched_job_init() will
> > be split up, with drm_sched_job_arm() never failing. And that's the
> > part where the rq selection will end up in.
> >
> > Note that if having an empty sched_list set on an entity is indeed a
> > valid use-case, we can keep that check in job_init even after the split
> > into job_init/arm.
> >
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: "Christian König" <christian.koenig@amd.com>
> > Cc: Luben Tuikov <luben.tuikov@amd.com>
> > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > Cc: Steven Price <steven.price@arm.com>
> > Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> > Cc: Boris Brezillon <boris.brezillon@collabora.com>
> > Cc: Jack Zhang <Jack.Zhang1@amd.com>
> > ---
> >   drivers/gpu/drm/scheduler/sched_entity.c | 2 +-
> >   drivers/gpu/drm/scheduler/sched_main.c   | 3 +--
> >   2 files changed, 2 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> > index 79554aa4dbb1..6fc116ee7302 100644
> > --- a/drivers/gpu/drm/scheduler/sched_entity.c
> > +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> > @@ -45,7 +45,7 @@
> >    * @guilty: atomic_t set to 1 when a job on this queue
> >    *          is found to be guilty causing a timeout
> >    *
> > - * Note: the sched_list should have at least one element to schedule
> > + * Note: the sched_list must have at least one element to schedule
> >    *       the entity
> >    *
> >    * Returns 0 on success or a negative error code on failure.
> > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > index 33c414d55fab..01dd47154181 100644
> > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > @@ -586,8 +586,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
> >       struct drm_gpu_scheduler *sched;
> >
> >       drm_sched_entity_select_rq(entity);
> > -     if (!entity->rq)
> > -             return -ENOENT;
> > +     BUG_ON(!entity->rq);
> >
> >       sched = entity->rq->sched;
> >
>
Christian König July 9, 2021, 7:23 a.m. UTC | #3
Am 09.07.21 um 09:14 schrieb Daniel Vetter:
> On Fri, Jul 9, 2021 at 8:53 AM Christian König <christian.koenig@amd.com> wrote:
>> Am 08.07.21 um 19:37 schrieb Daniel Vetter:
>>> If it does, someone managed to set up a sched_entity without
>>> schedulers, which is just a driver bug.
>> NAK, it is perfectly valid for rq selection to fail.
> There isn't a better way to explain stuff to someone who's new to the
> code and tries to improve it with docs than to NAK stuff with
> incomplete explanations?

Well as far as I understand it a NAK means that the author has missed 
something important and needs to re-iterate.

It's just to say that we absolutely can't merge a patch or something 
will break.

>> See drm_sched_pick_best():
>>
>>                   if (!sched->ready) {
>>                           DRM_WARN("scheduler %s is not ready, skipping",
>>                                    sched->name);
>>                           continue;
>>                   }
>>
>> This can happen when a device reset fails for some engine.
> Well yeah I didn't expect amdgpu to just change this directly, so I
> didn't find it. Getting an ENOENT on a hw failure instead of an EIO is
> a bit interesting semantics I guess, also what happens with the jobs
> which raced against the scheduler not being ready? I'm not seeing any
> checks for ready in the main scheduler logic so this at least looks
> somewhat accidental as a side effect, also no other driver than amdgpu
> communicates that reset failed back to drm/sched like this. They seem
> to just not, and I guess timeout on the next request will get us into
> an endless reset loop?

Correct. Key point is that there aren't any jobs which are currently 
scheduled.

When the ready flag is changed the scheduler is paused, e.g. the main 
thread is not running any more.
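
Roughly, the flow on the amdgpu side looks like this (heavily condensed
sketch; the reset helper name is made up, only the ready handling matters):

	drm_sched_stop(&ring->sched, bad_job);	/* parks the scheduler thread */

	r = do_engine_reset(ring);		/* placeholder for the real reset path */
	if (r)
		/* the engine could not be recovered, mark its scheduler unusable */
		ring->sched.ready = false;

	drm_sched_start(&ring->sched, true);	/* resume what still works */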

I'm pretty sure that all of this is horribly racy, but nobody really 
looked into the design from a higher level as far as I know.

Christian.



> -Daniel
>
>
>> Regards,
>> Christian.
>>
>>> We BUG_ON() here because in the next patch drm_sched_job_init() will
>>> be split up, with drm_sched_job_arm() never failing. And that's the
>>> part where the rq selection will end up in.
>>>
>>> Note that if having an empty sched_list set on an entity is indeed a
>>> valid use-case, we can keep that check in job_init even after the split
>>> into job_init/arm.
>>>
>>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
>>> Cc: "Christian König" <christian.koenig@amd.com>
>>> Cc: Luben Tuikov <luben.tuikov@amd.com>
>>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>>> Cc: Steven Price <steven.price@arm.com>
>>> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>> Cc: Boris Brezillon <boris.brezillon@collabora.com>
>>> Cc: Jack Zhang <Jack.Zhang1@amd.com>
>>> ---
>>>    drivers/gpu/drm/scheduler/sched_entity.c | 2 +-
>>>    drivers/gpu/drm/scheduler/sched_main.c   | 3 +--
>>>    2 files changed, 2 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
>>> index 79554aa4dbb1..6fc116ee7302 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_entity.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
>>> @@ -45,7 +45,7 @@
>>>     * @guilty: atomic_t set to 1 when a job on this queue
>>>     *          is found to be guilty causing a timeout
>>>     *
>>> - * Note: the sched_list should have at least one element to schedule
>>> + * Note: the sched_list must have at least one element to schedule
>>>     *       the entity
>>>     *
>>>     * Returns 0 on success or a negative error code on failure.
>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>>> index 33c414d55fab..01dd47154181 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>> @@ -586,8 +586,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
>>>        struct drm_gpu_scheduler *sched;
>>>
>>>        drm_sched_entity_select_rq(entity);
>>> -     if (!entity->rq)
>>> -             return -ENOENT;
>>> +     BUG_ON(!entity->rq);
>>>
>>>        sched = entity->rq->sched;
>>>
>
Daniel Vetter July 9, 2021, 8 a.m. UTC | #4
On Fri, Jul 9, 2021 at 9:23 AM Christian König <christian.koenig@amd.com> wrote:
> Am 09.07.21 um 09:14 schrieb Daniel Vetter:
> > On Fri, Jul 9, 2021 at 8:53 AM Christian König <christian.koenig@amd.com> wrote:
> >> Am 08.07.21 um 19:37 schrieb Daniel Vetter:
> >>> If it does, someone managed to set up a sched_entity without
> >>> schedulers, which is just a driver bug.
> >> NAK, it is perfectly valid for rq selection to fail.
> > There isn't a better way to explain stuff to someone who's new to the
> > code and tries to improve it with docs than to NAK stuff with
> > incomplete explanations?
>
> Well as far as I understand it a NAK means that the author has missed
> something important and needs to re-iterate.

It comes across as shouting, at least to me (all uppercase and
all that) and personally I only associate it with unchecked angry
kernel maintainers on lkml celebrating their status and putting down
some noobs for shits and giggles. I think here on dri-devel you're the
only one doing it regularly.

> It's just to say that we absolutely can't merge a patch or something
> will break.

Well yeah I know that when a patch breaks something I can't merge it.
For drm-intel we also documented that clearly, but for drm-misc it's
not spelled out. I'll fix that.

> >> See drm_sched_pick_best():
> >>
> >>                   if (!sched->ready) {
> >>                           DRM_WARN("scheduler %s is not ready, skipping",
> >>                                    sched->name);
> >>                           continue;
> >>                   }
> >>
> >> This can happen when a device reset fails for some engine.
> > Well yeah I didn't expect amdgpu to just change this directly, so I
> > didn't find it. Getting an ENOENT on a hw failure instead of an EIO is
> > a bit interesting semantics I guess, also what happens with the jobs
> > which raced against the scheduler not being ready? I'm not seeing any
> > checks for ready in the main scheduler logic so this at least looks
> > somewhat accidental as a side effect, also no other driver than amdgpu
> > communicates that reset failed back to drm/sched like this. They seem
> > to just not, and I guess timeout on the next request will get us into
> > an endless reset loop?
>
> Correct. Key point is that there aren't any jobs which are currently
> scheduled.
>
> When the ready flag is changed the scheduler is paused, e.g. the main
> thread is not running any more.
>
> I'm pretty sure that all of this is horribly racy, but nobody really
> looked into the design from a higher level as far as I know.

Yeah the scheduler thread is fine because it's stopped, but it also
doesn't look at sched->ready, so it can't race. What does race is new
submissions, and if they stuff something into the queue then I'm
wondering what happens to that. Also what happens to the requests
already in the queue.

Eventually I guess userspace notices the ENOENT, tears down the
context, and the kernel then also tears down the context and cleans up
the mess. But it's rather inglorious until it collapses down to a
coherent state again I think.

Or is there something with the scheduler restart flow which is
guaranteed to catch these, and we're maybe just missing a bunch of
barriers?

Either way I think a proper interface to terminally wedge a sched
would be good, so that at least we can pass back something meaningful
like -EIO. And also tell "the gpu died" apart from "the driver author
tore down the scheduler while it was still in use", which I think we
really should catch with some WARN_ON.
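
Something like this hypothetical shape is what I mean (nothing of it exists
today; the wedge flag and all names below are purely illustrative):

	/* Hypothetical: mark a scheduler as terminally dead. */
	void drm_sched_wedge(struct drm_gpu_scheduler *sched)
	{
		/* drm_sched_pick_best() would then skip wedged schedulers, like !ready today. */
		sched->wedged = true;	/* hypothetical flag, does not exist */
	}

	int drm_sched_job_init(struct drm_sched_job *job,
			       struct drm_sched_entity *entity,
			       void *owner)
	{
		drm_sched_entity_select_rq(entity);
		if (!entity->rq) {
			/* Entity set up without any scheduler: driver bug. */
			if (WARN_ON(!entity->num_sched_list))
				return -EINVAL;
			/* All schedulers for this entity are wedged: gpu died. */
			return -EIO;
		}
		/* rest of the setup unchanged */
		return 0;
	}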

Anyway for the immediate issue of "don't break amdgpu" I think I'll
reshuffle the split between job_init and job_arm again, and add a big
comment to job_init that it can fail with ENOENT, and why, and what
kind of interface would be more proper. i915 will need the terminally
wedged flow too so I'll probably have to look into this, but that will
need some proper thought.

Cheers, Daniel


>
> Christian.
>
>
>
> > -Daniel
> >
> >
> >> Regards,
> >> Christian.
> >>
> >>> We BUG_ON() here because in the next patch drm_sched_job_init() will
> >>> be split up, with drm_sched_job_arm() never failing. And that's the
> >>> part where the rq selection will end up in.
> >>>
> >>> Note that if having an empty sched_list set on an entity is indeed a
> >>> valid use-case, we can keep that check in job_init even after the split
> >>> into job_init/arm.
> >>>
> >>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> >>> Cc: "Christian König" <christian.koenig@amd.com>
> >>> Cc: Luben Tuikov <luben.tuikov@amd.com>
> >>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> >>> Cc: Steven Price <steven.price@arm.com>
> >>> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> >>> Cc: Boris Brezillon <boris.brezillon@collabora.com>
> >>> Cc: Jack Zhang <Jack.Zhang1@amd.com>
> >>> ---
> >>>    drivers/gpu/drm/scheduler/sched_entity.c | 2 +-
> >>>    drivers/gpu/drm/scheduler/sched_main.c   | 3 +--
> >>>    2 files changed, 2 insertions(+), 3 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> >>> index 79554aa4dbb1..6fc116ee7302 100644
> >>> --- a/drivers/gpu/drm/scheduler/sched_entity.c
> >>> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> >>> @@ -45,7 +45,7 @@
> >>>     * @guilty: atomic_t set to 1 when a job on this queue
> >>>     *          is found to be guilty causing a timeout
> >>>     *
> >>> - * Note: the sched_list should have at least one element to schedule
> >>> + * Note: the sched_list must have at least one element to schedule
> >>>     *       the entity
> >>>     *
> >>>     * Returns 0 on success or a negative error code on failure.
> >>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> >>> index 33c414d55fab..01dd47154181 100644
> >>> --- a/drivers/gpu/drm/scheduler/sched_main.c
> >>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> >>> @@ -586,8 +586,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
> >>>        struct drm_gpu_scheduler *sched;
> >>>
> >>>        drm_sched_entity_select_rq(entity);
> >>> -     if (!entity->rq)
> >>> -             return -ENOENT;
> >>> +     BUG_ON(!entity->rq);
> >>>
> >>>        sched = entity->rq->sched;
> >>>
> >
>


--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
Christian König July 9, 2021, 8:11 a.m. UTC | #5
Am 09.07.21 um 10:00 schrieb Daniel Vetter:
> On Fri, Jul 9, 2021 at 9:23 AM Christian König <christian.koenig@amd.com> wrote:
>> Am 09.07.21 um 09:14 schrieb Daniel Vetter:
>>> On Fri, Jul 9, 2021 at 8:53 AM Christian König <christian.koenig@amd.com> wrote:
>>>> Am 08.07.21 um 19:37 schrieb Daniel Vetter:
>>>>> If it does, someone managed to set up a sched_entity without
>>>>> schedulers, which is just a driver bug.
>>>> NAK, it is perfectly valid for rq selection to fail.
>>> There isn't a better way to explain stuff to someone who's new to the
>>> code and tries to improve it with docs than to NAK stuff with
>>> incomplete explanations?
>> Well as far as I understand it a NAK means that the author has missed
>> something important and needs to re-iterate.
> It comes across as shouting, at least to me (all uppercase and
> all that) and personally I only associate it with unchecked angry
> kernel maintainers on lkml celebrating their status and putting down
> some noobs for shits and giggles. I think here on dri-devel you're the
> only one doing it regularly.

I've learned a different meaning for this.

A NAK in communication means that something was missing and you should 
re-try. E.g. think about RS-232 ACK/NAK mode.

>> It's just to say that we absolutely can't merge a patch or something
>> will break.
> Well yeah I know that when a patch breaks something I can't merge it.
> For drm-intel we also documented that clearly, but for drm-misc it's
> not spelled out. I'll fix that.
>
>>>> See drm_sched_pick_best():
>>>>
>>>>                    if (!sched->ready) {
>>>>                            DRM_WARN("scheduler %s is not ready, skipping",
>>>>                                     sched->name);
>>>>                            continue;
>>>>                    }
>>>>
>>>> This can happen when a device reset fails for some engine.
>>> Well yeah I didn't expect amdgpu to just change this directly, so I
>>> didn't find it. Getting an ENOENT on a hw failure instead of an EIO is
>>> a bit interesting semantics I guess, also what happens with the jobs
>>> which raced against the scheduler not being ready? I'm not seeing any
>>> checks for ready in the main scheduler logic so this at least looks
>>> somewhat accidental as a side effect, also no other driver than amdgpu
>>> communicates that reset failed back to drm/sched like this. They seem
>>> to just not, and I guess timeout on the next request will get us into
>>> an endless reset loop?
>> Correct. Key point is that there aren't any jobs which are currently
>> scheduled.
>>
>> When the ready flag is changed the scheduler is paused, e.g. the main
>> thread is not running any more.
>>
>> I'm pretty sure that all of this is horribly racy, but nobody really
>> looked into the design from a higher level as far as I know.
> Yeah the scheduler thread is fine because it's stopped, but it also
> doesn't look at sched->ready, so it can't race. What does race is new
> submissions, and if they stuff something into the queue then I'm
> wondering what happens to that. Also what happens to the requests
> already in the queue.
>
> Eventually I guess userspace notices the ENOENT, tears down the
> context, and the kernel then also tears down the context and cleans up
> the mess. But it's rather inglorious until it collapses down to a
> coherent state again I think.
>
> Or is there something with the scheduler restart flow which is
> guaranteed to catch these, and we're maybe just missing a bunch of
> barriers?

I honestly have no idea. Never looked so deeply into the big picture of 
this.

I've just tried to play firefighter and stopped people from touching 
the flag during GPU reset when it isn't necessary.

> Either way I think a proper interface to terminally wedge a sched
> would be good, so that at least we can pass back something meaningful
> like -EIO. And also tell "the gpu died" apart from "the driver author
> tore down the scheduler while it was still in use", which I think we
> really should catch with some WARN_ON.
>
> Anyway for the immediate issue of "don't break amdgpu" I think I'll
> reshuffle the split between job_init and job_arm again, and add a big
> comment to job_init that it can fail with ENOENT, and why, and what
> kind of interface would be more proper. i915 will need the terminally
> wedged flow too so I'll probably have to look into this, but that will
> need some proper thought.

Yeah, the functionality is absolutely necessary.

Regards,
Christian.

>
> Cheers, Daniel
>
>
>> Christian.
>>
>>
>>
>>> -Daniel
>>>
>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>> We BUG_ON() here because in the next patch drm_sched_job_init() will
>>>>> be split up, with drm_sched_job_arm() never failing. And that's the
>>>>> part where the rq selection will end up in.
>>>>>
>>>>> Note that if having an empty sched_list set on an entity is indeed a
>>>>> valid use-case, we can keep that check in job_init even after the split
>>>>> into job_init/arm.
>>>>>
>>>>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
>>>>> Cc: "Christian König" <christian.koenig@amd.com>
>>>>> Cc: Luben Tuikov <luben.tuikov@amd.com>
>>>>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>>>>> Cc: Steven Price <steven.price@arm.com>
>>>>> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>>> Cc: Boris Brezillon <boris.brezillon@collabora.com>
>>>>> Cc: Jack Zhang <Jack.Zhang1@amd.com>
>>>>> ---
>>>>>     drivers/gpu/drm/scheduler/sched_entity.c | 2 +-
>>>>>     drivers/gpu/drm/scheduler/sched_main.c   | 3 +--
>>>>>     2 files changed, 2 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
>>>>> index 79554aa4dbb1..6fc116ee7302 100644
>>>>> --- a/drivers/gpu/drm/scheduler/sched_entity.c
>>>>> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
>>>>> @@ -45,7 +45,7 @@
>>>>>      * @guilty: atomic_t set to 1 when a job on this queue
>>>>>      *          is found to be guilty causing a timeout
>>>>>      *
>>>>> - * Note: the sched_list should have at least one element to schedule
>>>>> + * Note: the sched_list must have at least one element to schedule
>>>>>      *       the entity
>>>>>      *
>>>>>      * Returns 0 on success or a negative error code on failure.
>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>>>>> index 33c414d55fab..01dd47154181 100644
>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>>> @@ -586,8 +586,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
>>>>>         struct drm_gpu_scheduler *sched;
>>>>>
>>>>>         drm_sched_entity_select_rq(entity);
>>>>> -     if (!entity->rq)
>>>>> -             return -ENOENT;
>>>>> +     BUG_ON(!entity->rq);
>>>>>
>>>>>         sched = entity->rq->sched;
>>>>>
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

Patch

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
index 79554aa4dbb1..6fc116ee7302 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -45,7 +45,7 @@ 
  * @guilty: atomic_t set to 1 when a job on this queue
  *          is found to be guilty causing a timeout
  *
- * Note: the sched_list should have at least one element to schedule
+ * Note: the sched_list must have at least one element to schedule
  *       the entity
  *
  * Returns 0 on success or a negative error code on failure.
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 33c414d55fab..01dd47154181 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -586,8 +586,7 @@  int drm_sched_job_init(struct drm_sched_job *job,
 	struct drm_gpu_scheduler *sched;
 
 	drm_sched_entity_select_rq(entity);
-	if (!entity->rq)
-		return -ENOENT;
+	BUG_ON(!entity->rq);
 
 	sched = entity->rq->sched;