[4/6] drm/ttm: use ttm_bo_reserve_slowpath_nolru in ttm_eu_reserve_buffers

Message ID 1354277580-17958-4-git-send-email-maarten.lankhorst@canonical.com (mailing list archive)
State New, archived

Commit Message

Maarten Lankhorst Nov. 30, 2012, 12:12 p.m. UTC
This requires re-use of the seqno. Instead of spinning with a new
seqno every time, we keep the current one, but still drop all other
reservations we hold. Only when we succeed do we try to get back our
other reservations.

This should increase fairness slightly.
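
The resulting flow is roughly the following (a condensed sketch of the
patched ttm_eu_reserve_buffers(); error handling, list ref-counting and
the cpu_writers check are elided):

	spin_lock(&glob->lru_lock);
	val_seq = entry->bo->bdev->val_seq++;	/* taken once, then re-used */

retry:
	list_for_each_entry(entry, list, head) {
		struct ttm_buffer_object *bo = entry->bo;

		if (entry->reserved)	/* kept from an earlier slowpath */
			continue;

		ret = ttm_bo_reserve_nolru(bo, true, true, true, val_seq);
		if (ret == -EAGAIN) {
			/* contended: drop every reservation we hold */
			ttm_eu_backoff_reservation_locked(list);
			spin_unlock(&glob->lru_lock);
			/* block until we own the contended bo */
			ret = ttm_bo_reserve_slowpath_nolru(bo, true, val_seq);
			if (unlikely(ret != 0))
				return ret;
			spin_lock(&glob->lru_lock);
			entry->reserved = true;
			/* re-acquire the rest with the same val_seq */
			goto retry;
		}
	}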

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
---
 drivers/gpu/drm/ttm/ttm_execbuf_util.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

Comments

Jerome Glisse Dec. 6, 2012, 1:36 a.m. UTC | #1
On Fri, Nov 30, 2012 at 7:12 AM, Maarten Lankhorst
<m.b.lankhorst@gmail.com> wrote:
> This requires re-use of the seqno. Instead of spinning with a new
> seqno every time, we keep the current one, but still drop all other
> reservations we hold. Only when we succeed do we try to get back our
> other reservations.
>
> This should increase fairness slightly.
>
> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
> ---
>  drivers/gpu/drm/ttm/ttm_execbuf_util.c | 14 ++++++++++++--
>  1 file changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_execbuf_util.c b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
> index c7d3236..c02b2b6 100644
> --- a/drivers/gpu/drm/ttm/ttm_execbuf_util.c
> +++ b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
> @@ -129,13 +129,17 @@ int ttm_eu_reserve_buffers(struct list_head *list)
>         entry = list_first_entry(list, struct ttm_validate_buffer, head);
>         glob = entry->bo->glob;
>
> -retry:
>         spin_lock(&glob->lru_lock);
>         val_seq = entry->bo->bdev->val_seq++;
>
> +retry:

After more thinking while going over the patches to send comments, I
think this one is bad, really bad. Here is the bad case I see: we have
a list of two BOs, A and B. We succeed in reserving A, so we set
seq_valid to true and val_seq to our val_seq. We fail on B, take the
slow path and unreserve A. Some other CPU reserves A, but before it
writes either seq_valid or val_seq we wake up, reserve B, and retry
reserving A. In reserve_nolru we see A as reserved, enter the while
loop, see seq_valid set, test val_seq against our same old val_seq,
get == and return -EDEADLK, which is definitely not what we want to
do. If we increment val_seq on retry we will never hit this case. So
what exactly does not incrementing it buy us?
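
Spelled out as a timeline (a sketch of the interleaving; S is the
val_seq we took at the start):

	CPU0 (us)				CPU1 (other reserver)

	reserve A:
	  val_seq = S, seq_valid = true
	fail to reserve B
	unreserve A
						atomically reserves A, but
						has not yet written val_seq
						or seq_valid
	slowpath-reserve B, goto retry
	reserve_nolru(A):
	  sees A reserved, sees stale
	  seq_valid == true and stale
	  val_seq == S  =>  -EDEADLK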

I think the read and write memory barriers I was talking about in
patch 1 should prevent any such thing from happening, but I need to
sleep on that and have nightmares about all the things that can go
wrong.

>         list_for_each_entry(entry, list, head) {
>                 struct ttm_buffer_object *bo = entry->bo;
>
> +               /* already slowpath reserved? */
> +               if (entry->reserved)
> +                       continue;
> +
>                 ret = ttm_bo_reserve_nolru(bo, true, true, true, val_seq);
>                 switch (ret) {
>                 case 0:
> @@ -157,9 +161,15 @@ retry:
>                         ttm_eu_backoff_reservation_locked(list);
>                         spin_unlock(&glob->lru_lock);
>                         ttm_eu_list_ref_sub(list);
> -                       ret = ttm_bo_wait_unreserved(bo, true);
> +                       ret = ttm_bo_reserve_slowpath_nolru(bo, true, val_seq);
>                         if (unlikely(ret != 0))
>                                 return ret;
> +                       spin_lock(&glob->lru_lock);
> +                       entry->reserved = true;
> +                       if (unlikely(atomic_read(&bo->cpu_writers) > 0)) {
> +                               ret = -EBUSY;
> +                               goto err;
> +                       }
>                         goto retry;
>                 default:
>                         goto err;
> --
> 1.8.0
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel
Maarten Lankhorst Dec. 6, 2012, 10:52 a.m. UTC | #2
On 06-12-12 02:36, Jerome Glisse wrote:
> On Fri, Nov 30, 2012 at 7:12 AM, Maarten Lankhorst
> <m.b.lankhorst@gmail.com> wrote:
>> This requires re-use of the seqno. Instead of spinning with a new
>> seqno every time, we keep the current one, but still drop all other
>> reservations we hold. Only when we succeed do we try to get back our
>> other reservations.
>>
>> This should increase fairness slightly.
>>
>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
>> ---
>>  drivers/gpu/drm/ttm/ttm_execbuf_util.c | 14 ++++++++++++--
>>  1 file changed, 12 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/ttm/ttm_execbuf_util.c b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
>> index c7d3236..c02b2b6 100644
>> --- a/drivers/gpu/drm/ttm/ttm_execbuf_util.c
>> +++ b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
>> @@ -129,13 +129,17 @@ int ttm_eu_reserve_buffers(struct list_head *list)
>>         entry = list_first_entry(list, struct ttm_validate_buffer, head);
>>         glob = entry->bo->glob;
>>
>> -retry:
>>         spin_lock(&glob->lru_lock);
>>         val_seq = entry->bo->bdev->val_seq++;
>>
>> +retry:
> After more thinking while going over the patches to send comments, I
> think this one is bad, really bad. Here is the bad case I see: we have
> a list of two BOs, A and B. We succeed in reserving A, so we set
> seq_valid to true and val_seq to our val_seq. We fail on B, take the
> slow path and unreserve A. Some other CPU reserves A, but before it
> writes either seq_valid or val_seq we wake up, reserve B, and retry
> reserving A. In reserve_nolru we see A as reserved, enter the while
> loop, see seq_valid set, test val_seq against our same old val_seq,
> get == and return -EDEADLK, which is definitely not what we want to
> do. If we increment val_seq on retry we will never hit this case. So
> what exactly does not incrementing it buy us?
Hm, that would indeed be a valid problem.

Annoyingly, I can't do the conversion to mutexes until the end of the
patchset. I solved this in the ticket mutexes by making the seqno an
atomic that is unset before unlocking, but doing that with the current
TTM code would force extra wake_up_all calls inside the reserve path,
since we lack the slowpath handling that mutexes have.

Nouveau would already run into the same problems with patch 1, so
perhaps we could disable the -EDEADLK handling and sleep in that case
for now? AFAICT, it is a driver error if -EDEADLK occurs.

If performance were unimportant we could unset seq_valid before
unreserving and add an smp_wmb() between the val_seq and seq_valid
assignments. wake_up_all() would then be called unconditionally during
reservation, but that call is expensive...
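
Roughly like this (an untested sketch; the unreserve side assumes the
current atomic reserved counter and event_queue implementation):

	/* reserve side: make val_seq visible before seq_valid */
	bo->val_seq = val_seq;
	smp_wmb();
	bo->seq_valid = true;
	wake_up_all(&bo->event_queue);	/* now unconditional, expensive */

	/* unreserve side: clear seq_valid while still holding the
	 * reservation, so nobody can pair a stale val_seq with
	 * seq_valid == true */
	bo->seq_valid = false;
	atomic_set(&bo->reserved, 0);
	wake_up_all(&bo->event_queue);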

> I think the read and write memory barriers I was talking about in
> patch 1 should prevent any such thing from happening, but I need to
> sleep on that and have nightmares about all the things that can go
> wrong.
No, it wouldn't help; there is still a race immediately after the xchg
call. :-(

~Maarten
Jerome Glisse Dec. 6, 2012, 3:38 p.m. UTC | #3
On Thu, Dec 6, 2012 at 5:52 AM, Maarten Lankhorst
<maarten.lankhorst@canonical.com> wrote:
> On 06-12-12 02:36, Jerome Glisse wrote:
>> On Fri, Nov 30, 2012 at 7:12 AM, Maarten Lankhorst
>> <m.b.lankhorst@gmail.com> wrote:
>>> This requires re-use of the seqno. Instead of spinning with a new
>>> seqno every time, we keep the current one, but still drop all other
>>> reservations we hold. Only when we succeed do we try to get back our
>>> other reservations.
>>>
>>> This should increase fairness slightly.
>>>
>>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
>>> ---
>>>  drivers/gpu/drm/ttm/ttm_execbuf_util.c | 14 ++++++++++++--
>>>  1 file changed, 12 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/ttm/ttm_execbuf_util.c b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
>>> index c7d3236..c02b2b6 100644
>>> --- a/drivers/gpu/drm/ttm/ttm_execbuf_util.c
>>> +++ b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
>>> @@ -129,13 +129,17 @@ int ttm_eu_reserve_buffers(struct list_head *list)
>>>         entry = list_first_entry(list, struct ttm_validate_buffer, head);
>>>         glob = entry->bo->glob;
>>>
>>> -retry:
>>>         spin_lock(&glob->lru_lock);
>>>         val_seq = entry->bo->bdev->val_seq++;
>>>
>>> +retry:
>> After more thinking while going over the patches to send comments, I
>> think this one is bad, really bad. Here is the bad case I see: we have
>> a list of two BOs, A and B. We succeed in reserving A, so we set
>> seq_valid to true and val_seq to our val_seq. We fail on B, take the
>> slow path and unreserve A. Some other CPU reserves A, but before it
>> writes either seq_valid or val_seq we wake up, reserve B, and retry
>> reserving A. In reserve_nolru we see A as reserved, enter the while
>> loop, see seq_valid set, test val_seq against our same old val_seq,
>> get == and return -EDEADLK, which is definitely not what we want to
>> do. If we increment val_seq on retry we will never hit this case. So
>> what exactly does not incrementing it buy us?
> Hm, that would indeed be a valid problem.
>
> Annoyingly, I can't do the conversion to mutexes until the end of the
> patchset. I solved this in the ticket mutexes by making the seqno an
> atomic that is unset before unlocking, but doing that with the current
> TTM code would force extra wake_up_all calls inside the reserve path,
> since we lack the slowpath handling that mutexes have.
>
> Nouveau would already run into the same problems with patch 1, so
> perhaps we could disable the -EDEADLK handling and sleep in that case
> for now? AFAICT, it is a driver error if -EDEADLK occurs.
>
> If performance were unimportant we could unset seq_valid before
> unreserving and add an smp_wmb() between the val_seq and seq_valid
> assignments. wake_up_all() would then be called unconditionally during
> reservation, but that call is expensive...
>
>> I think the read and write memory barriers I was talking about in
>> patch 1 should prevent any such thing from happening, but I need to
>> sleep on that and have nightmares about all the things that can go
>> wrong.
> No, it wouldn't help; there is still a race immediately after the
> xchg call. :-(
>
> ~Maarten

Then just keep the retry label where it is; it doesn't change the
fundamentals of the patch and it still avoids the -EDEADLK scenario.
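
I.e., on top of your patch, just (untested sketch):

 	spin_lock(&glob->lru_lock);
-	val_seq = entry->bo->bdev->val_seq++;
-
 retry:
+	val_seq = entry->bo->bdev->val_seq++;
+
 	list_for_each_entry(entry, list, head) {

so every retry pass takes a fresh seqno again while keeping the
slowpath reservation.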

Cheers,
Jerome

Patch

diff --git a/drivers/gpu/drm/ttm/ttm_execbuf_util.c b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
index c7d3236..c02b2b6 100644
--- a/drivers/gpu/drm/ttm/ttm_execbuf_util.c
+++ b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
@@ -129,13 +129,17 @@  int ttm_eu_reserve_buffers(struct list_head *list)
 	entry = list_first_entry(list, struct ttm_validate_buffer, head);
 	glob = entry->bo->glob;
 
-retry:
 	spin_lock(&glob->lru_lock);
 	val_seq = entry->bo->bdev->val_seq++;
 
+retry:
 	list_for_each_entry(entry, list, head) {
 		struct ttm_buffer_object *bo = entry->bo;
 
+		/* already slowpath reserved? */
+		if (entry->reserved)
+			continue;
+
 		ret = ttm_bo_reserve_nolru(bo, true, true, true, val_seq);
 		switch (ret) {
 		case 0:
@@ -157,9 +161,15 @@  retry:
 			ttm_eu_backoff_reservation_locked(list);
 			spin_unlock(&glob->lru_lock);
 			ttm_eu_list_ref_sub(list);
-			ret = ttm_bo_wait_unreserved(bo, true);
+			ret = ttm_bo_reserve_slowpath_nolru(bo, true, val_seq);
 			if (unlikely(ret != 0))
 				return ret;
+			spin_lock(&glob->lru_lock);
+			entry->reserved = true;
+			if (unlikely(atomic_read(&bo->cpu_writers) > 0)) {
+				ret = -EBUSY;
+				goto err;
+			}
 			goto retry;
 		default:
 			goto err;