diff mbox series

[06/16] drm/i915/gt: Decouple completed requests on unwind

Message ID 20201124114219.29020-6-chris@chris-wilson.co.uk (mailing list archive)
State New, archived
Series [01/16] drm/i915/gem: Drop free_work for GEM contexts

Commit Message

Chris Wilson Nov. 24, 2020, 11:42 a.m. UTC
Since the introduction of preempt-to-busy, requests can complete in the
background, even while they are not on the engine->active.requests list.
As such, the engine->active.requests list itself is not in strict
retirement order, and we have to scan the entire list while unwinding so
as not to miss any. However, if a request is completed we currently
leave it on the list [until retirement], but we could just as easily
remove it and stop treating it as active. We would then only have to
traverse it once while unwinding in quick succession.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_lrc.c | 6 ++++--
 drivers/gpu/drm/i915/i915_request.c | 3 ++-
 2 files changed, 6 insertions(+), 3 deletions(-)

Comments

Tvrtko Ursulin Nov. 24, 2020, 5:13 p.m. UTC | #1
On 24/11/2020 11:42, Chris Wilson wrote:
> Since the introduction of preempt-to-busy, requests can complete in the
> background, even while they are not on the engine->active.requests list.
> As such, the engine->active.request list itself is not in strict
> retirement order, and we have to scan the entire list while unwinding to
> not miss any. However, if the request is completed we currently leave it
> on the list [until retirement], but we could just as simply remove it
> and stop treating it as active. We would only have to then traverse it
> once while unwinding in quick succession.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/gt/intel_lrc.c | 6 ++++--
>   drivers/gpu/drm/i915/i915_request.c | 3 ++-
>   2 files changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> index 30aa59fb7271..cf11cbac241b 100644
> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> @@ -1116,8 +1116,10 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine)
>   	list_for_each_entry_safe_reverse(rq, rn,
>   					 &engine->active.requests,
>   					 sched.link) {
> -		if (i915_request_completed(rq))
> -			continue; /* XXX */
> +		if (i915_request_completed(rq)) {
> +			list_del_init(&rq->sched.link);
> +			continue;
> +		}
>   
>   		__i915_request_unsubmit(rq);
>   
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index 8d7d29c9e375..a9db1376b996 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -321,7 +321,8 @@ bool i915_request_retire(struct i915_request *rq)
>   	 * after removing the breadcrumb and signaling it, so that we do not
>   	 * inadvertently attach the breadcrumb to a completed request.
>   	 */
> -	remove_from_engine(rq);
> +	if (!list_empty(&rq->sched.link))
> +		remove_from_engine(rq);

The list_empty check is unlocked, so is the list_del_init in 
remove_from_engine safe on a potentially already unlinked request, or 
does it need to re-check under the lock?

Regards,

Tvrtko

>   	GEM_BUG_ON(!llist_empty(&rq->execute_cb));
>   
>   	__list_del_entry(&rq->link); /* poison neither prev/next (RCU walks) */
>
Chris Wilson Nov. 24, 2020, 5:31 p.m. UTC | #2
Quoting Tvrtko Ursulin (2020-11-24 17:13:02)
> 
> On 24/11/2020 11:42, Chris Wilson wrote:
> > Since the introduction of preempt-to-busy, requests can complete in the
> > background, even while they are not on the engine->active.requests list.
> > As such, the engine->active.request list itself is not in strict
> > retirement order, and we have to scan the entire list while unwinding to
> > not miss any. However, if the request is completed we currently leave it
> > on the list [until retirement], but we could just as simply remove it
> > and stop treating it as active. We would only have to then traverse it
> > once while unwinding in quick succession.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > ---
> >   drivers/gpu/drm/i915/gt/intel_lrc.c | 6 ++++--
> >   drivers/gpu/drm/i915/i915_request.c | 3 ++-
> >   2 files changed, 6 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> > index 30aa59fb7271..cf11cbac241b 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> > @@ -1116,8 +1116,10 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine)
> >       list_for_each_entry_safe_reverse(rq, rn,
> >                                        &engine->active.requests,
> >                                        sched.link) {
> > -             if (i915_request_completed(rq))
> > -                     continue; /* XXX */
> > +             if (i915_request_completed(rq)) {
> > +                     list_del_init(&rq->sched.link);
> > +                     continue;
> > +             }
> >   
> >               __i915_request_unsubmit(rq);
> >   
> > diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> > index 8d7d29c9e375..a9db1376b996 100644
> > --- a/drivers/gpu/drm/i915/i915_request.c
> > +++ b/drivers/gpu/drm/i915/i915_request.c
> > @@ -321,7 +321,8 @@ bool i915_request_retire(struct i915_request *rq)
> >        * after removing the breadcrumb and signaling it, so that we do not
> >        * inadvertently attach the breadcrumb to a completed request.
> >        */
> > -     remove_from_engine(rq);
> > +     if (!list_empty(&rq->sched.link))
> > +             remove_from_engine(rq);
> 
> The list_empty check is unlocked so is list_del_init in 
> remove_from_engine safe on potentially already unlinked request or it 
> needs to re-check under the lock?

It's safe. The unwind is under the lock, and remove_from_engine takes
the lock, and both do list_del_init() which is a no-op if already
removed. And the end state of the flag bits is the same on each path. We
can skip the __notify_execute_cb_imm() since we know in unwind it is
executing and there should be no cb.

The test before we take the lock is only allowed to skip the active.lock
if it sees the list is already decoupled, in which case we can leave it
to the unwind to remove it from the engine (and we know that the request
can only have been inflight prior to completion). Since the test is not
locked, we don't serialise with the removal, but the list_del_init is
the last action on the request so there is no window where the unwind is
accessing the request after it may have been retired.

list_move() will not confuse list_empty(), as although it does a
__list_del_entry(), the entry is never temporarily re-initialised to an
empty list.
-Chris
Tvrtko Ursulin Nov. 25, 2020, 9:15 a.m. UTC | #3
On 24/11/2020 17:31, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2020-11-24 17:13:02)
>>
>> On 24/11/2020 11:42, Chris Wilson wrote:
>>> Since the introduction of preempt-to-busy, requests can complete in the
>>> background, even while they are not on the engine->active.requests list.
>>> As such, the engine->active.request list itself is not in strict
>>> retirement order, and we have to scan the entire list while unwinding to
>>> not miss any. However, if the request is completed we currently leave it
>>> on the list [until retirement], but we could just as simply remove it
>>> and stop treating it as active. We would only have to then traverse it
>>> once while unwinding in quick succession.
>>>
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> ---
>>>    drivers/gpu/drm/i915/gt/intel_lrc.c | 6 ++++--
>>>    drivers/gpu/drm/i915/i915_request.c | 3 ++-
>>>    2 files changed, 6 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>> index 30aa59fb7271..cf11cbac241b 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>> @@ -1116,8 +1116,10 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine)
>>>        list_for_each_entry_safe_reverse(rq, rn,
>>>                                         &engine->active.requests,
>>>                                         sched.link) {
>>> -             if (i915_request_completed(rq))
>>> -                     continue; /* XXX */
>>> +             if (i915_request_completed(rq)) {
>>> +                     list_del_init(&rq->sched.link);
>>> +                     continue;
>>> +             }
>>>    
>>>                __i915_request_unsubmit(rq);
>>>    
>>> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
>>> index 8d7d29c9e375..a9db1376b996 100644
>>> --- a/drivers/gpu/drm/i915/i915_request.c
>>> +++ b/drivers/gpu/drm/i915/i915_request.c
>>> @@ -321,7 +321,8 @@ bool i915_request_retire(struct i915_request *rq)
>>>         * after removing the breadcrumb and signaling it, so that we do not
>>>         * inadvertently attach the breadcrumb to a completed request.
>>>         */
>>> -     remove_from_engine(rq);
>>> +     if (!list_empty(&rq->sched.link))
>>> +             remove_from_engine(rq);
>>
>> The list_empty check is unlocked so is list_del_init in
>> remove_from_engine safe on potentially already unlinked request or it
>> needs to re-check under the lock?
> 
> It's safe. The unwind is under the lock, and remove_from_engine takes
> the lock, and both do list_del_init() which is a no-op if already
> removed. And the end state of the flag bits is the same on each path. We
> can skip the __notify_execute_cb_imm() since we know in unwind it is
> executing and there should be no cb.
> 
> The test before we take the lock is only allowed to skip the active.lock
> if it sees the list is already decoupled, in which case we can leave it
> to the unwind to remove it from the engine (and we know that the request
> can only have been inflight prior to completion). Since the test is not
> locked, we don't serialise with the removal, but the list_del_init is
> the last action on the request so there is no window where the unwind is
> accessing the request after it may have been retired.
> 
> list_move() will not confuse list_empty(), as although it does a
> list_del_entry, it is not temporarily re-initialised to an empty list.

list_del_init() is indeed safe. As for list_move()... which one do you 
think can race with retire? Preempt-to-busy unwinding an almost-completed 
request yet again? Or even a preempt timeout racing with completion?

Regards,

Tvrtko
Chris Wilson Nov. 25, 2020, 10:21 a.m. UTC | #4
Quoting Tvrtko Ursulin (2020-11-25 09:15:25)
> 
> On 24/11/2020 17:31, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2020-11-24 17:13:02)
> >>
> >> On 24/11/2020 11:42, Chris Wilson wrote:
> >>> Since the introduction of preempt-to-busy, requests can complete in the
> >>> background, even while they are not on the engine->active.requests list.
> >>> As such, the engine->active.request list itself is not in strict
> >>> retirement order, and we have to scan the entire list while unwinding to
> >>> not miss any. However, if the request is completed we currently leave it
> >>> on the list [until retirement], but we could just as simply remove it
> >>> and stop treating it as active. We would only have to then traverse it
> >>> once while unwinding in quick succession.
> >>>
> >>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >>> ---
> >>>    drivers/gpu/drm/i915/gt/intel_lrc.c | 6 ++++--
> >>>    drivers/gpu/drm/i915/i915_request.c | 3 ++-
> >>>    2 files changed, 6 insertions(+), 3 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> >>> index 30aa59fb7271..cf11cbac241b 100644
> >>> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> >>> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> >>> @@ -1116,8 +1116,10 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine)
> >>>        list_for_each_entry_safe_reverse(rq, rn,
> >>>                                         &engine->active.requests,
> >>>                                         sched.link) {
> >>> -             if (i915_request_completed(rq))
> >>> -                     continue; /* XXX */
> >>> +             if (i915_request_completed(rq)) {
> >>> +                     list_del_init(&rq->sched.link);
> >>> +                     continue;
> >>> +             }
> >>>    
> >>>                __i915_request_unsubmit(rq);
> >>>    
> >>> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> >>> index 8d7d29c9e375..a9db1376b996 100644
> >>> --- a/drivers/gpu/drm/i915/i915_request.c
> >>> +++ b/drivers/gpu/drm/i915/i915_request.c
> >>> @@ -321,7 +321,8 @@ bool i915_request_retire(struct i915_request *rq)
> >>>         * after removing the breadcrumb and signaling it, so that we do not
> >>>         * inadvertently attach the breadcrumb to a completed request.
> >>>         */
> >>> -     remove_from_engine(rq);
> >>> +     if (!list_empty(&rq->sched.link))
> >>> +             remove_from_engine(rq);
> >>
> >> The list_empty check is unlocked so is list_del_init in
> >> remove_from_engine safe on potentially already unlinked request or it
> >> needs to re-check under the lock?
> > 
> > It's safe. The unwind is under the lock, and remove_from_engine takes
> > the lock, and both do list_del_init() which is a no-op if already
> > removed. And the end state of the flag bits is the same on each path. We
> > can skip the __notify_execute_cb_imm() since we know in unwind it is
> > executing and there should be no cb.
> > 
> > The test before we take the lock is only allowed to skip the active.lock
> > if it sees the list is already decoupled, in which case we can leave it
> > to the unwind to remove it from the engine (and we know that the request
> > can only have been inflight prior to completion). Since the test is not
> > locked, we don't serialise with the removal, but the list_del_init is
> > the last action on the request so there is no window where the unwind is
> > accessing the request after it may have been retired.
> > 
> > list_move() will not confuse list_empty(), as although it does a
> > list_del_entry, it is not temporarily re-initialised to an empty list.
> 
> List_del_init is indeed safe. List_move.. which one you think can race 
> with retire? Preempt-to-busy unwinding an almost completed request yet 
> again? Or even preempt timeout racing with completion?

Here in unwind. We pass the completion check, but the request may still
be running and complete at any time (until we submit & ack the new ELSP).
So an unlocked list_empty check during retire can race with any of the
list_move during unwind and resubmit. (On resubmit, we check completed
under the lock and drop the request in __i915_request_submit which
should also leave it in a consistent state as if we had called
remove_from_engine.)
-Chris
Tvrtko Ursulin Nov. 25, 2020, 4:21 p.m. UTC | #5
On 25/11/2020 10:21, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2020-11-25 09:15:25)
>>
>> On 24/11/2020 17:31, Chris Wilson wrote:
>>> Quoting Tvrtko Ursulin (2020-11-24 17:13:02)
>>>>
>>>> On 24/11/2020 11:42, Chris Wilson wrote:
>>>>> Since the introduction of preempt-to-busy, requests can complete in the
>>>>> background, even while they are not on the engine->active.requests list.
>>>>> As such, the engine->active.request list itself is not in strict
>>>>> retirement order, and we have to scan the entire list while unwinding to
>>>>> not miss any. However, if the request is completed we currently leave it
>>>>> on the list [until retirement], but we could just as simply remove it
>>>>> and stop treating it as active. We would only have to then traverse it
>>>>> once while unwinding in quick succession.
>>>>>
>>>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>>>> ---
>>>>>     drivers/gpu/drm/i915/gt/intel_lrc.c | 6 ++++--
>>>>>     drivers/gpu/drm/i915/i915_request.c | 3 ++-
>>>>>     2 files changed, 6 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>> index 30aa59fb7271..cf11cbac241b 100644
>>>>> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>> @@ -1116,8 +1116,10 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine)
>>>>>         list_for_each_entry_safe_reverse(rq, rn,
>>>>>                                          &engine->active.requests,
>>>>>                                          sched.link) {
>>>>> -             if (i915_request_completed(rq))
>>>>> -                     continue; /* XXX */
>>>>> +             if (i915_request_completed(rq)) {
>>>>> +                     list_del_init(&rq->sched.link);
>>>>> +                     continue;
>>>>> +             }
>>>>>     
>>>>>                 __i915_request_unsubmit(rq);
>>>>>     
>>>>> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
>>>>> index 8d7d29c9e375..a9db1376b996 100644
>>>>> --- a/drivers/gpu/drm/i915/i915_request.c
>>>>> +++ b/drivers/gpu/drm/i915/i915_request.c
>>>>> @@ -321,7 +321,8 @@ bool i915_request_retire(struct i915_request *rq)
>>>>>          * after removing the breadcrumb and signaling it, so that we do not
>>>>>          * inadvertently attach the breadcrumb to a completed request.
>>>>>          */
>>>>> -     remove_from_engine(rq);
>>>>> +     if (!list_empty(&rq->sched.link))
>>>>> +             remove_from_engine(rq);
>>>>
>>>> The list_empty check is unlocked so is list_del_init in
>>>> remove_from_engine safe on potentially already unlinked request or it
>>>> needs to re-check under the lock?
>>>
>>> It's safe. The unwind is under the lock, and remove_from_engine takes
>>> the lock, and both do list_del_init() which is a no-op if already
>>> removed. And the end state of the flag bits is the same on each path. We
>>> can skip the __notify_execute_cb_imm() since we know in unwind it is
>>> executing and there should be no cb.
>>>
>>> The test before we take the lock is only allowed to skip the active.lock
>>> if it sees the list is already decoupled, in which case we can leave it
>>> to the unwind to remove it from the engine (and we know that the request
>>> can only have been inflight prior to completion). Since the test is not
>>> locked, we don't serialise with the removal, but the list_del_init is
>>> the last action on the request so there is no window where the unwind is
>>> accessing the request after it may have been retired.
>>>
>>> list_move() will not confuse list_empty(), as although it does a
>>> list_del_entry, it is not temporarily re-initialised to an empty list.
>>
>> List_del_init is indeed safe. List_move.. which one you think can race
>> with retire? Preempt-to-busy unwinding an almost completed request yet
>> again? Or even preempt timeout racing with completion?
> 
> Here in unwind. We pass the completion check, but the request may still
> be running and complete at any time (until we submit & ack the new ELSP).
> So an unlocked list_empty check during retire can race with any of the
> list_move during unwind and resubmit. (On resubmit, we check completed
> under the lock and drop the request in __i915_request_submit which
> should also leave it in a consistent state as if we had called
> remove_from_engine.)

Right, yes, that seems safe as well. The only new problem could have 
been a false negative, meaning remove_from_engine() _not_ being called 
by mistake on a transient false list_empty condition.

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

Patch

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 30aa59fb7271..cf11cbac241b 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -1116,8 +1116,10 @@  __unwind_incomplete_requests(struct intel_engine_cs *engine)
 	list_for_each_entry_safe_reverse(rq, rn,
 					 &engine->active.requests,
 					 sched.link) {
-		if (i915_request_completed(rq))
-			continue; /* XXX */
+		if (i915_request_completed(rq)) {
+			list_del_init(&rq->sched.link);
+			continue;
+		}
 
 		__i915_request_unsubmit(rq);
 
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 8d7d29c9e375..a9db1376b996 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -321,7 +321,8 @@  bool i915_request_retire(struct i915_request *rq)
 	 * after removing the breadcrumb and signaling it, so that we do not
 	 * inadvertently attach the breadcrumb to a completed request.
 	 */
-	remove_from_engine(rq);
+	if (!list_empty(&rq->sched.link))
+		remove_from_engine(rq);
 	GEM_BUG_ON(!llist_empty(&rq->execute_cb));
 
 	__list_del_entry(&rq->link); /* poison neither prev/next (RCU walks) */