From patchwork Mon Oct 15 18:34:17 2012
From: Maarten Lankhorst
Date: Mon, 15 Oct 2012 20:34:17 +0200
Subject: Re: buggy/weird behavior in ttm
To: Thomas Hellstrom
Cc: dri-devel@lists.freedesktop.org
Message-ID: <507C5729.1080003@canonical.com>
In-Reply-To: <507C2DA5.9020900@canonical.com>

On 15-10-12 17:37, Maarten Lankhorst wrote:
>>> To make multi-object reservation work, the fix is to add a ticket "lock" of which all
>>> the reservation objects are nested locks. Since in this case the ticket lock would
>>> prevent deadlocks, this is acceptable. Having 2 ticket "locks" at the same time would
>>> rightfully count as a deadlock. If you hold a reservation from a ticket, then try to
>>> reserve without a ticket, it counts as deadlock too. See below for some examples I was
>>> using to test.
>> But if a ticket lock can be used to abstract a locking sequence that we know will never
>> deadlock, why can't we abstract locking from a list with atomic list removal with
>> another "lock", and make sure we get the order right between them?
> No, see the test below, lockdep won't be fooled by your diversions that easily!! :-)
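A minimal sketch of the ticket idea, assuming the ticket is modeled as its own
lockdep class (all names below are hypothetical illustration, not existing TTM API):

#include <linux/kernel.h>
#include <linux/lockdep.h>

static struct lock_class_key resv_ticket_key;

struct resv_ticket {
	struct lockdep_map dep_map;
};

static void ticket_init(struct resv_ticket *t)
{
	lockdep_init_map(&t->dep_map, "resv_ticket", &resv_ticket_key, 0);
}

static void ticket_acquire(struct resv_ticket *t)
{
	/* Taking a second ticket while one is held hits the same class
	 * twice: lockdep reports "possible recursive locking detected". */
	lock_acquire(&t->dep_map, 0, 0, 0, 2, NULL, _RET_IP_);
}

static void ticket_release(struct resv_ticket *t)
{
	lock_release(&t->dep_map, 0, _RET_IP_);
}

Each reservation taken under a ticket would then pass t->dep_map as the nest_lock
argument of its own lock_acquire(), while a ticketless reserve acquires the
reservation class directly, so mixing the two gives lockdep an ordering violation
to report.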
> However, if the following rules are valid, life becomes a lot easier:
> 1. items on the ddestroy list are not allowed to have a new sync object attached;
>    only reservations for destruction of the object are allowed.
> 2. the only valid things left at this point are unmapping from vm space and
>    destruction of the buffer backing in memory or vram.
> 3. all reservations outside of ttm_bo_cleanup_refs(or_queue) are forbidden.
> (XXX: check if an exception is needed for drivers to accomplish this destruction?)
>
> Adding a WARN_ON(!list_empty(&bo->ddestroy)); to ttm_bo_reserve and
> ttm_eu_reserve_buffers should be enough to catch violators of the above rules.
>
> If those rules hold, destruction becomes a lot more straightforward.
> ttm_bo_swapout and ttm_mem_evict_first can both do the reserve with the lru_lock
> held, which is guaranteed to succeed, and the buffer is also guaranteed to still
> be on the ddestroy list since we haven't dropped the lock yet.
>
> So with a reservation, the lru_lock held, and the entry on the ddestroy list
> still alive:
>
> Do a trywait on the bo. If it returns -EBUSY and no_wait_gpu is unset, get a
> reference to bo->sync_obj and unreserve.
>
> If -EBUSY was returned and no_wait_gpu is set, unlock the lru_lock and return
> the error.
>
> If -EBUSY was returned and no_wait_gpu was not set, unlock the lru_lock, wait
> on our copy of bo->sync_obj and unref it; return if the wait fails. If the wait
> succeeds, retake the lru_lock.
>
> Check if the ddestroy entry is gone. If so, someone else was faster: return 0.
> If not, simply remove the bo from all lists without reservation, unref
> bo->sync_obj**, and perform the remaining cleanup.
>
> As far as I can see, this wouldn't leave open a race even if 2 threads do the
> same, although the second thread might return 0 to signal success before the
> backing is gone. That can also happen right now, and it's harmless: in those
> cases the callers simply call these functions one more time, and by then the
> destroyed buffer is definitely no longer on the swap/lru lists.
>
> It would also keep current behavior the same as far as I can tell, where
> multiple waiters can wait for the same bo to be destroyed.
>
> **) Yes, strictly speaking a violation of the fence rules, but in this case the
> buffer has already dropped to 0 references, and there's a BUG_ON in
> ttm_bo_release_list that would get triggered otherwise. Can alternatively be
> fixed by adding a BUG_ON to ttm_bo_release_list that triggers if there is a
> sync_obj that's not in the signaled state.

Attached are three follow-up patches to show what I mean. They're based on my
tree, i.e. with all the patches applied that I posted on Friday. This is not
thoroughly tested, but I think it should work.

The first fixes radeon so it doesn't crash when move_notify is called from
release_list. The second moves the memory cleanup from ttm_bo_cleanup_memtype_use
to ttm_bo_release_list, which is more readable and a more logical place for it.
The third cleans up how ttm_bo_cleanup_refs is called.

Also available on http://cgit.freedesktop.org/~mlankhorst/linux/log/?h=v10-wip

If this looks ok I'll send these out once the rest of the patches I posted on
Friday are reviewed. :-)
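For reference, a condensed sketch of the flow described above; this is essentially
the shape ttm_bo_cleanup_refs takes in the third patch below (cleanup_refs_sketch
is a hypothetical name, and the final list removal is elided):

static int cleanup_refs_sketch(struct ttm_buffer_object *bo,
			       bool interruptible, bool no_wait_gpu)
{
	struct ttm_bo_driver *driver = bo->bdev->driver;
	struct ttm_bo_global *glob = bo->glob;
	int ret;

	/* Called with lru_lock held and bo reserved; trywait for idle. */
	ret = ttm_bo_wait(bo, false, false, true);

	if (ret && no_wait_gpu) {
		/* Busy and not allowed to wait: unreserve and bail. */
		atomic_set(&bo->reserved, 0);
		wake_up_all(&bo->event_queue);
		spin_unlock(&glob->lru_lock);
		return ret;
	} else if (ret) {
		/* Busy: take our own reference to the fence, unreserve,
		 * and wait outside the locks. */
		void *sync_obj = driver->sync_obj_ref(&bo->sync_obj);

		atomic_set(&bo->reserved, 0);
		wake_up_all(&bo->event_queue);
		spin_unlock(&glob->lru_lock);

		ret = driver->sync_obj_wait(sync_obj, false, interruptible);
		driver->sync_obj_unref(&sync_obj);
		if (ret)
			return ret;

		/* Retake the lru_lock; no re-reserve. */
		spin_lock(&glob->lru_lock);
	} else {
		/* Already idle: just unreserve. */
		atomic_set(&bo->reserved, 0);
		wake_up_all(&bo->event_queue);
	}

	if (list_empty(&bo->ddestroy)) {
		/* Someone else was faster; report success. */
		spin_unlock(&glob->lru_lock);
		return 0;
	}

	/* Still on ddestroy: remove the bo from all lists without a
	 * reservation, unref bo->sync_obj and finish the cleanup
	 * (elided; see the third patch). */
	return 0;
}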
======

drm/radeon: allow move_notify to be called without reservation

The few places that care should have those checks instead.
This allows destruction of bo backed memory without a reservation.

Signed-off-by: Maarten Lankhorst
---
 drivers/gpu/drm/radeon/radeon_gart.c   | 1 -
 drivers/gpu/drm/radeon/radeon_object.c | 2 +-
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_gart.c b/drivers/gpu/drm/radeon/radeon_gart.c
index 0ee5e29..701b215 100644
--- a/drivers/gpu/drm/radeon/radeon_gart.c
+++ b/drivers/gpu/drm/radeon/radeon_gart.c
@@ -1044,7 +1044,6 @@ void radeon_vm_bo_invalidate(struct radeon_device *rdev,
 {
 	struct radeon_bo_va *bo_va;
 
-	BUG_ON(!radeon_bo_is_reserved(bo));
 	list_for_each_entry(bo_va, &bo->va, bo_list) {
 		bo_va->valid = false;
 	}
diff --git a/drivers/gpu/drm/radeon/radeon_object.c b/drivers/gpu/drm/radeon/radeon_object.c
index 5b959b6..fa954d7 100644
--- a/drivers/gpu/drm/radeon/radeon_object.c
+++ b/drivers/gpu/drm/radeon/radeon_object.c
@@ -539,7 +539,7 @@ void radeon_bo_get_tiling_flags(struct radeon_bo *bo,
 int radeon_bo_check_tiling(struct radeon_bo *bo, bool has_moved,
 				bool force_drop)
 {
-	BUG_ON(!radeon_bo_is_reserved(bo));
+	BUG_ON(!radeon_bo_is_reserved(bo) && !force_drop);
 
 	if (!(bo->tiling_flags & RADEON_TILING_SURFACE))
 		return 0;

======

drm/ttm: remove ttm_bo_cleanup_memtype_use, move to release_list instead

Signed-off-by: Maarten Lankhorst
---
 drivers/gpu/drm/ttm/ttm_bo.c | 45 ++++++++++++-------------------------------
 1 file changed, 13 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index dd6a3e6..9b95e96 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -146,8 +146,16 @@ static void ttm_bo_release_list(struct kref *list_kref)
 	BUG_ON(!list_empty(&bo->lru));
 	BUG_ON(!list_empty(&bo->ddestroy));
 
-	if (bo->ttm)
+	if (bo->bdev->driver->move_notify)
+		bo->bdev->driver->move_notify(bo, NULL);
+
+	if (bo->ttm) {
+		ttm_tt_unbind(bo->ttm);
 		ttm_tt_destroy(bo->ttm);
+		bo->ttm = NULL;
+	}
+	ttm_bo_mem_put(bo, &bo->mem);
+
 	atomic_dec(&bo->glob->bo_count);
 	if (bo->destroy)
 		bo->destroy(bo);
@@ -465,35 +473,6 @@ out_err:
 	return ret;
 }
 
-/**
- * Call bo::reserved.
- * Will release GPU memory type usage on destruction.
- * This is the place to put in driver specific hooks to release
- * driver private resources.
- * Will release the bo::reserved lock.
- */
-
-static void ttm_bo_cleanup_memtype_use(struct ttm_buffer_object *bo)
-{
-	if (bo->bdev->driver->move_notify)
-		bo->bdev->driver->move_notify(bo, NULL);
-
-	if (bo->ttm) {
-		ttm_tt_unbind(bo->ttm);
-		ttm_tt_destroy(bo->ttm);
-		bo->ttm = NULL;
-	}
-	ttm_bo_mem_put(bo, &bo->mem);
-
-	atomic_set(&bo->reserved, 0);
-
-	/*
-	 * Make processes trying to reserve really pick it up.
- */ - smp_mb__after_atomic_dec(); - wake_up_all(&bo->event_queue); -} - static void ttm_bo_cleanup_refs_or_queue(struct ttm_buffer_object *bo) { struct ttm_bo_device *bdev = bo->bdev; @@ -517,8 +496,9 @@ static void ttm_bo_cleanup_refs_or_queue(struct ttm_buffer_object *bo) put_count = ttm_bo_del_from_lru(bo); + atomic_set(&bo->reserved, 0); + wake_up_all(&bo->event_queue); spin_unlock(&glob->lru_lock); - ttm_bo_cleanup_memtype_use(bo); ttm_bo_list_ref_sub(bo, put_count, true); @@ -608,8 +588,9 @@ retry: list_del_init(&bo->ddestroy); ++put_count; + atomic_set(&bo->reserved, 0); + wake_up_all(&bo->event_queue); spin_unlock(&glob->lru_lock); - ttm_bo_cleanup_memtype_use(bo); ttm_bo_list_ref_sub(bo, put_count, true); ====== drm/ttm: add sense to ttm_bo_cleanup_refs Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/ttm/ttm_bo.c | 98 ++++++++++++++------------------ drivers/gpu/drm/ttm/ttm_execbuf_util.c | 1 2 files changed, 45 insertions(+), 54 deletions(-) diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c index 9b95e96..34d4bb1 100644 --- a/drivers/gpu/drm/ttm/ttm_bo.c +++ b/drivers/gpu/drm/ttm/ttm_bo.c @@ -295,6 +295,8 @@ int ttm_bo_reserve(struct ttm_buffer_object *bo, int put_count = 0; int ret; + WARN_ON(!list_empty_careful(&bo->ddestroy)); + spin_lock(&glob->lru_lock); ret = ttm_bo_reserve_locked(bo, interruptible, no_wait, use_sequence, sequence); @@ -522,14 +524,15 @@ queue: * If bo idle, remove from delayed- and lru lists, and unref. * If not idle, do nothing. * + * Must be called with lru_lock and reservation held, this function + * will drop both before returning. + * * @interruptible Any sleeps should occur interruptibly. - * @no_wait_reserve Never wait for reserve. Return -EBUSY instead. * @no_wait_gpu Never wait for gpu. Return -EBUSY instead. */ static int ttm_bo_cleanup_refs(struct ttm_buffer_object *bo, bool interruptible, - bool no_wait_reserve, bool no_wait_gpu) { struct ttm_bo_device *bdev = bo->bdev; @@ -537,53 +540,45 @@ static int ttm_bo_cleanup_refs(struct ttm_buffer_object *bo, struct ttm_bo_global *glob = bo->glob; int put_count; int ret = 0; - void *sync_obj; - -retry: - spin_lock(&glob->lru_lock); - ret = ttm_bo_reserve_locked(bo, interruptible, - no_wait_reserve, false, 0); - - if (unlikely(ret != 0)) { - spin_unlock(&glob->lru_lock); - return ret; - } + ret = ttm_bo_wait(bo, false, false, true); - if (unlikely(list_empty(&bo->ddestroy))) { + if (ret && no_wait_gpu) { atomic_set(&bo->reserved, 0); wake_up_all(&bo->event_queue); spin_unlock(&glob->lru_lock); - return 0; - } - ret = ttm_bo_wait(bo, false, false, true); - - if (ret) { - if (no_wait_gpu) { - atomic_set(&bo->reserved, 0); - wake_up_all(&bo->event_queue); - spin_unlock(&glob->lru_lock); - return ret; - } + return ret; + } else if (ret) { + void *sync_obj; /** - * Take a reference to the fence and unreserve, if the wait - * was succesful and no new sync_obj was attached, - * ttm_bo_wait in retry will return ret = 0, and end the loop. + * Take a reference to the fence and unreserve, + * at this point the buffer should be dead, so + * no new sync objects can be attached. 
 		 */
-
 		sync_obj = driver->sync_obj_ref(&bo->sync_obj);
 		atomic_set(&bo->reserved, 0);
 		wake_up_all(&bo->event_queue);
 		spin_unlock(&glob->lru_lock);
 
-		ret = driver->sync_obj_wait(bo->sync_obj, false, interruptible);
+		ret = driver->sync_obj_wait(sync_obj, false, interruptible);
 		driver->sync_obj_unref(&sync_obj);
 		if (ret)
 			return ret;
-		goto retry;
+		spin_lock(&glob->lru_lock);
+	} else {
+		atomic_set(&bo->reserved, 0);
+		wake_up_all(&bo->event_queue);
+	}
+
+	if (unlikely(list_empty(&bo->ddestroy))) {
+		spin_unlock(&glob->lru_lock);
+		return 0;
 	}
 
+	if (bo->sync_obj)
+		driver->sync_obj_unref(&bo->sync_obj);
+
 	put_count = ttm_bo_del_from_lru(bo);
 	list_del_init(&bo->ddestroy);
 	++put_count;
@@ -625,9 +620,12 @@ static int ttm_bo_delayed_delete(struct ttm_bo_device *bdev, bool remove_all)
 			kref_get(&nentry->list_kref);
 		}
 
-		spin_unlock(&glob->lru_lock);
-		ret = ttm_bo_cleanup_refs(entry, false, !remove_all,
-					  !remove_all);
+		ret = ttm_bo_reserve_locked(entry, false, !remove_all);
+		if (ret)
+			spin_unlock(&glob->lru_lock);
+		else
+			ret = ttm_bo_cleanup_refs(entry, false, !remove_all);
+
 		kref_put(&entry->list_kref, ttm_bo_release_list);
 		entry = nentry;
 
@@ -778,18 +776,6 @@ retry:
 	bo = list_first_entry(&man->lru, struct ttm_buffer_object, lru);
 	kref_get(&bo->list_kref);
 
-	if (!list_empty(&bo->ddestroy)) {
-		spin_unlock(&glob->lru_lock);
-		ret = ttm_bo_cleanup_refs(bo, interruptible,
-					  true, no_wait_gpu);
-		kref_put(&bo->list_kref, ttm_bo_release_list);
-
-		if (likely(ret == 0 || ret == -ERESTARTSYS))
-			return ret;
-
-		goto retry;
-	}
-
 	ret = ttm_bo_reserve_locked(bo, false, true, false, 0);
 
 	if (unlikely(ret == -EBUSY)) {
@@ -808,6 +794,12 @@ retry:
 		goto retry;
 	}
 
+	if (!list_empty(&bo->ddestroy)) {
+		ret = ttm_bo_cleanup_refs(bo, interruptible, no_wait_gpu);
+		kref_put(&bo->list_kref, ttm_bo_release_list);
+		return ret;
+	}
+
 	put_count = ttm_bo_del_from_lru(bo);
 	spin_unlock(&glob->lru_lock);
 
@@ -1718,14 +1710,6 @@ static int ttm_bo_swapout(struct ttm_mem_shrink *shrink)
 				     struct ttm_buffer_object, swap);
 		kref_get(&bo->list_kref);
 
-		if (!list_empty(&bo->ddestroy)) {
-			spin_unlock(&glob->lru_lock);
-			(void) ttm_bo_cleanup_refs(bo, false, false, false);
-			kref_put(&bo->list_kref, ttm_bo_release_list);
-			spin_lock(&glob->lru_lock);
-			continue;
-		}
-
 		/**
 		 * Reserve buffer. Since we unlock while sleeping, we need
 		 * to re-check that nobody removed us from the swap-list while
@@ -1741,6 +1725,12 @@ static int ttm_bo_swapout(struct ttm_mem_shrink *shrink)
 			}
 		}
 
+		if (!list_empty(&bo->ddestroy)) {
+			ret = ttm_bo_cleanup_refs(bo, false, false);
+			kref_put(&bo->list_kref, ttm_bo_release_list);
+			return ret;
+		}
+
 		BUG_ON(ret != 0);
 		put_count = ttm_bo_del_from_lru(bo);
 		spin_unlock(&glob->lru_lock);
diff --git a/drivers/gpu/drm/ttm/ttm_execbuf_util.c b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
index 51b5e97..3ea2457 100644
--- a/drivers/gpu/drm/ttm/ttm_execbuf_util.c
+++ b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
@@ -153,6 +153,7 @@ retry:
 		struct ttm_buffer_object *bo = entry->bo;
 
 retry_this_bo:
+		WARN_ON(!list_empty_careful(&bo->ddestroy));
 		ret = ttm_bo_reserve_locked(bo, true, true, true, val_seq);
 		switch (ret) {
 		case 0: