Message ID | 1354277580-17958-4-git-send-email-maarten.lankhorst@canonical.com (mailing list archive)
---|---
State | New, archived
On Fri, Nov 30, 2012 at 7:12 AM, Maarten Lankhorst <m.b.lankhorst@gmail.com> wrote:
> This requires re-use of the seqno, which increases fairness slightly.
> Instead of spinning with a new seqno every time we keep the current one,
> but still drop all other reservations we hold. Only when we succeed,
> we try to get back our other reservations again.
>
> This should increase fairness slightly as well.
>
> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
> ---
>  drivers/gpu/drm/ttm/ttm_execbuf_util.c | 14 ++++++++++++--
>  1 file changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_execbuf_util.c b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
> index c7d3236..c02b2b6 100644
> --- a/drivers/gpu/drm/ttm/ttm_execbuf_util.c
> +++ b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
> @@ -129,13 +129,17 @@ int ttm_eu_reserve_buffers(struct list_head *list)
>  	entry = list_first_entry(list, struct ttm_validate_buffer, head);
>  	glob = entry->bo->glob;
>
> -retry:
>  	spin_lock(&glob->lru_lock);
>  	val_seq = entry->bo->bdev->val_seq++;
>
> +retry:

After more thinking while I was going over the patches to send comments, I think this one is bad, really bad. Here is the bad case I see: we have a list of two BOs, A and B. We succeed in reserving A, setting seq_valid to true and val_seq to our val_seq. We fail on B, take the slow path, and unreserve A. Some other CPU reserves A, but before it writes either seq_valid or val_seq, we wake up, reserve B, and retry reserving A. In reserve_nolru we see A as reserved, enter the while loop, see seq_valid set, test val_seq against our same old val_seq, find them equal, and return -EDEADLK, which is definitely not what we want to do. If we increment val_seq on retry we can never hit this case. So what exactly does not incrementing it buy us?

I think the read and write memory barriers I was talking about in patch 1 should prevent any such thing from happening, but I need to sleep on that and have nightmares about all the things that can go wrong.

>  	list_for_each_entry(entry, list, head) {
>  		struct ttm_buffer_object *bo = entry->bo;
>
> +		/* already slowpath reserved? */
> +		if (entry->reserved)
> +			continue;
> +
>  		ret = ttm_bo_reserve_nolru(bo, true, true, true, val_seq);
>  		switch (ret) {
>  		case 0:
> @@ -157,9 +161,15 @@ retry:
>  			ttm_eu_backoff_reservation_locked(list);
>  			spin_unlock(&glob->lru_lock);
>  			ttm_eu_list_ref_sub(list);
> -			ret = ttm_bo_wait_unreserved(bo, true);
> +			ret = ttm_bo_reserve_slowpath_nolru(bo, true, val_seq);
>  			if (unlikely(ret != 0))
>  				return ret;
> +			spin_lock(&glob->lru_lock);
> +			entry->reserved = true;
> +			if (unlikely(atomic_read(&bo->cpu_writers) > 0)) {
> +				ret = -EBUSY;
> +				goto err;
> +			}
>  			goto retry;
>  		default:
>  			goto err;
> --
> 1.8.0
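For readers without the TTM source at hand, the test Jerome is describing lives in ttm_bo_reserve_nolru(). The sketch below is a stripped-down rendition of that check; the field names (reserved, seq_valid, val_seq) follow the TTM buffer object of this era, but the function is illustrative rather than a verbatim copy:

```c
/*
 * Illustrative sketch (not verbatim TTM code) of the deadlock test in
 * ttm_bo_reserve_nolru().  If our own earlier attempt left seq_valid
 * and val_seq behind, and another CPU has re-reserved the buffer but
 * not yet overwritten those fields, the comparison below matches our
 * own stale sequence number and we report a deadlock against ourselves.
 */
static int reserve_nolru_sketch(struct ttm_buffer_object *bo,
				bool use_sequence, uint32_t sequence)
{
	int ret;

	while (unlikely(atomic_xchg(&bo->reserved, 1) != 0)) {
		if (use_sequence && bo->seq_valid &&
		    sequence == bo->val_seq)
			return -EDEADLK;	/* the false positive */

		/* otherwise sleep until the holder releases it, then retry */
		ret = ttm_bo_wait_unreserved(bo, true);
		if (unlikely(ret != 0))
			return ret;
	}

	bo->val_seq = sequence;
	bo->seq_valid = use_sequence;
	return 0;
}
```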
On 06-12-12 02:36, Jerome Glisse wrote:
> On Fri, Nov 30, 2012 at 7:12 AM, Maarten Lankhorst
> <m.b.lankhorst@gmail.com> wrote:
>> This requires re-use of the seqno, which increases fairness slightly.
>> Instead of spinning with a new seqno every time we keep the current one,
>> but still drop all other reservations we hold. Only when we succeed,
>> we try to get back our other reservations again.
>>
>> This should increase fairness slightly as well.
>>
>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
>> ---
>>  drivers/gpu/drm/ttm/ttm_execbuf_util.c | 14 ++++++++++++--
>>  1 file changed, 12 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/ttm/ttm_execbuf_util.c b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
>> index c7d3236..c02b2b6 100644
>> --- a/drivers/gpu/drm/ttm/ttm_execbuf_util.c
>> +++ b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
>> @@ -129,13 +129,17 @@ int ttm_eu_reserve_buffers(struct list_head *list)
>>  	entry = list_first_entry(list, struct ttm_validate_buffer, head);
>>  	glob = entry->bo->glob;
>>
>> -retry:
>>  	spin_lock(&glob->lru_lock);
>>  	val_seq = entry->bo->bdev->val_seq++;
>>
>> +retry:
> After more thinking while I was going over the patches to send comments, I
> think this one is bad, really bad. Here is the bad case I see: we have a
> list of two BOs, A and B. We succeed in reserving A, setting seq_valid to
> true and val_seq to our val_seq. We fail on B, take the slow path, and
> unreserve A. Some other CPU reserves A, but before it writes either
> seq_valid or val_seq, we wake up, reserve B, and retry reserving A. In
> reserve_nolru we see A as reserved, enter the while loop, see seq_valid
> set, test val_seq against our same old val_seq, find them equal, and
> return -EDEADLK, which is definitely not what we want to do. If we
> increment val_seq on retry we can never hit this case. So what exactly
> does not incrementing it buy us?
Hm, that would indeed be a valid problem.

Annoyingly, I can't do the conversion to mutexes until the end of the patchset.
I solved this in the ticket mutexes by making the seqno an atomic that is unset
before unlocking. But doing that with the current TTM code would force extra
wake_up_all's inside the reserve path, since we lack the slowpath handling that
mutexes have.

Nouveau would already run into the same problem with patch 1, so perhaps we
could disable the -EDEADLK handling and sleep in that case for now? Afaict,
it's a driver error if -EDEADLK occurs.

If performance were unimportant, we could unset seq_valid before unreserving
and add an smp_wmb() between the seqno and seqno_valid assignments.
wake_up_all() would then have to be called unconditionally during reservation,
though, and that call is expensive.

> I think the read and write memory barriers I was talking about in patch 1
> should prevent any such thing from happening, but I need to sleep on that
> and have nightmares about all the things that can go wrong.
No, that wouldn't help; there is still a race immediately after the xchg call. :-(

~Maarten
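A minimal sketch of the ordering Maarten describes in words above, using the era's field names; publish_sequence() and unreserve_sketch() are hypothetical helper names for illustration, not TTM API:

```c
/*
 * Hypothetical helpers (not TTM API) showing the "unset seq_valid
 * before unreserving" variant: val_seq is published strictly before
 * seq_valid, and seq_valid is cleared before the reservation is
 * dropped, so a concurrent reserver can never observe a valid flag
 * paired with a stale sequence number.
 */
static void publish_sequence(struct ttm_buffer_object *bo, uint32_t seq)
{
	bo->val_seq = seq;
	smp_wmb();			/* order val_seq before seq_valid */
	bo->seq_valid = true;
}

static void unreserve_sketch(struct ttm_buffer_object *bo)
{
	bo->seq_valid = false;		/* invalidate before release */
	smp_wmb();
	atomic_set(&bo->reserved, 0);
	wake_up_all(&bo->event_queue);	/* the unconditional, costly call */
}
```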
On Thu, Dec 6, 2012 at 5:52 AM, Maarten Lankhorst <maarten.lankhorst@canonical.com> wrote:
> On 06-12-12 02:36, Jerome Glisse wrote:
>> On Fri, Nov 30, 2012 at 7:12 AM, Maarten Lankhorst
>> <m.b.lankhorst@gmail.com> wrote:
>>> This requires re-use of the seqno, which increases fairness slightly.
>>> Instead of spinning with a new seqno every time we keep the current one,
>>> but still drop all other reservations we hold. Only when we succeed,
>>> we try to get back our other reservations again.
>>>
>>> This should increase fairness slightly as well.
>>>
>>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
>>> ---
>>>  drivers/gpu/drm/ttm/ttm_execbuf_util.c | 14 ++++++++++++--
>>>  1 file changed, 12 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/ttm/ttm_execbuf_util.c b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
>>> index c7d3236..c02b2b6 100644
>>> --- a/drivers/gpu/drm/ttm/ttm_execbuf_util.c
>>> +++ b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
>>> @@ -129,13 +129,17 @@ int ttm_eu_reserve_buffers(struct list_head *list)
>>>  	entry = list_first_entry(list, struct ttm_validate_buffer, head);
>>>  	glob = entry->bo->glob;
>>>
>>> -retry:
>>>  	spin_lock(&glob->lru_lock);
>>>  	val_seq = entry->bo->bdev->val_seq++;
>>>
>>> +retry:
>> After more thinking while I was going over the patches to send comments, I
>> think this one is bad, really bad. Here is the bad case I see: we have a
>> list of two BOs, A and B. We succeed in reserving A, setting seq_valid to
>> true and val_seq to our val_seq. We fail on B, take the slow path, and
>> unreserve A. Some other CPU reserves A, but before it writes either
>> seq_valid or val_seq, we wake up, reserve B, and retry reserving A. In
>> reserve_nolru we see A as reserved, enter the while loop, see seq_valid
>> set, test val_seq against our same old val_seq, find them equal, and
>> return -EDEADLK, which is definitely not what we want to do. If we
>> increment val_seq on retry we can never hit this case. So what exactly
>> does not incrementing it buy us?
> Hm, that would indeed be a valid problem.
>
> Annoyingly, I can't do the conversion to mutexes until the end of the patchset.
> I solved this in the ticket mutexes by making the seqno an atomic that is unset
> before unlocking. But doing that with the current TTM code would force extra
> wake_up_all's inside the reserve path, since we lack the slowpath handling that
> mutexes have.
>
> Nouveau would already run into the same problem with patch 1, so perhaps we
> could disable the -EDEADLK handling and sleep in that case for now? Afaict,
> it's a driver error if -EDEADLK occurs.
>
> If performance were unimportant, we could unset seq_valid before unreserving
> and add an smp_wmb() between the seqno and seqno_valid assignments.
> wake_up_all() would then have to be called unconditionally during reservation,
> though, and that call is expensive.
>
>> I think the read and write memory barriers I was talking about in patch 1
>> should prevent any such thing from happening, but I need to sleep on that
>> and have nightmares about all the things that can go wrong.
> No, that wouldn't help; there is still a race immediately after the xchg call. :-(
>
> ~Maarten

Then just keep the retry label where it is; it doesn't change the fundamentals of the patch, and it avoids the -EDEADLK scenario.

Cheers,
Jerome
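In code terms, Jerome's suggestion simply restores the pre-patch label placement, so every pass through the reservation loop allocates a fresh seqno. A sketch of the resulting loop head (the lines are taken from the patch context; the elided body is unchanged):

```c
retry:
	spin_lock(&glob->lru_lock);
	val_seq = entry->bo->bdev->val_seq++;	/* fresh seqno on every retry */

	list_for_each_entry(entry, list, head) {
		/* ... reserve each buffer with val_seq, as in the patch ... */
	}
```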
diff --git a/drivers/gpu/drm/ttm/ttm_execbuf_util.c b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
index c7d3236..c02b2b6 100644
--- a/drivers/gpu/drm/ttm/ttm_execbuf_util.c
+++ b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
@@ -129,13 +129,17 @@ int ttm_eu_reserve_buffers(struct list_head *list)
 	entry = list_first_entry(list, struct ttm_validate_buffer, head);
 	glob = entry->bo->glob;
 
-retry:
 	spin_lock(&glob->lru_lock);
 	val_seq = entry->bo->bdev->val_seq++;
 
+retry:
 	list_for_each_entry(entry, list, head) {
 		struct ttm_buffer_object *bo = entry->bo;
 
+		/* already slowpath reserved? */
+		if (entry->reserved)
+			continue;
+
 		ret = ttm_bo_reserve_nolru(bo, true, true, true, val_seq);
 		switch (ret) {
 		case 0:
@@ -157,9 +161,15 @@ retry:
 			ttm_eu_backoff_reservation_locked(list);
 			spin_unlock(&glob->lru_lock);
 			ttm_eu_list_ref_sub(list);
-			ret = ttm_bo_wait_unreserved(bo, true);
+			ret = ttm_bo_reserve_slowpath_nolru(bo, true, val_seq);
 			if (unlikely(ret != 0))
 				return ret;
+			spin_lock(&glob->lru_lock);
+			entry->reserved = true;
+			if (unlikely(atomic_read(&bo->cpu_writers) > 0)) {
+				ret = -EBUSY;
+				goto err;
+			}
 			goto retry;
 		default:
 			goto err;
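The replacement call, ttm_bo_reserve_slowpath_nolru(), is introduced earlier in this series and is not shown in the hunk. Going only by how it is discussed in this thread, it plausibly has the shape sketched below (a guess, not the posted implementation): block until the current holder releases the buffer, then take the reservation while keeping the caller's existing val_seq instead of allocating a new one.

```c
/*
 * Rough guess (not the posted code) at the slowpath reserve used above:
 * sleep until the current holder releases the buffer, then reserve it
 * with the caller's existing sequence number so that deadlock ordering
 * against other reservers is preserved across the retry.
 */
static int reserve_slowpath_sketch(struct ttm_buffer_object *bo,
				   bool interruptible, uint32_t sequence)
{
	int ret;

	while (unlikely(atomic_xchg(&bo->reserved, 1) != 0)) {
		ret = ttm_bo_wait_unreserved(bo, interruptible);
		if (unlikely(ret != 0))
			return ret;
	}

	bo->val_seq = sequence;		/* keep, don't re-allocate, the seqno */
	bo->seq_valid = true;
	wake_up_all(&bo->event_queue);	/* let other sleepers recheck */
	return 0;
}
```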
This requires re-use of the seqno, which increases fairness slightly.
Instead of spinning with a new seqno every time we keep the current one,
but still drop all other reservations we hold. Only when we succeed,
we try to get back our other reservations again.

This should increase fairness slightly as well.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
---
 drivers/gpu/drm/ttm/ttm_execbuf_util.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)