diff mbox

[4/4] drm/ttm: Optimize reservation slightly

Message ID 1352122278-12896-5-git-send-email-thellstrom@vmware.com (mailing list archive)
State New, archived
Headers show

Commit Message

Thomas Hellstrom Nov. 5, 2012, 1:31 p.m. UTC
Reservation locking currently always takes place under the LRU spinlock.
Hence, strictly there is no need for an atomic_cmpxchg call; we can use
atomic_read followed by atomic_write since nobody else will ever reserve
without the lru spinlock held.
At least on Intel this should remove a locked bus cycle on successful
reserve.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
---
 drivers/gpu/drm/ttm/ttm_bo.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

Comments

Maarten Lankhorst Nov. 5, 2012, 2:01 p.m. UTC | #1
Hey,

Op 05-11-12 14:31, Thomas Hellstrom schreef:
> Reservation locking currently always takes place under the LRU spinlock.
> Hence, strictly there is no need for an atomic_cmpxchg call; we can use
> atomic_read followed by atomic_write since nobody else will ever reserve
> without the lru spinlock held.
> At least on Intel this should remove a locked bus cycle on successful
> reserve.
>
> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
>
Is that really a good thing to submit when I am busy killing lru lock around reserve? :-)

-	while (unlikely(atomic_cmpxchg(&bo->reserved, 0, 1) != 0)) {
+	while (unlikely(atomic_xchg(&bo->reserved, 1) != 0)) {

Works without lru lock too!

In fact mutexes are done in a similar way[1], except with some more magic, and unlocked state is 1, not 0.
However I do think that to get that right (saves a irq disable in unlock path, and less wakeups in contended
case), I should really just post the mutex extension patches for reservations and ride the flames. It's
getting too close to real mutexes so I really want it to be a mutex in that case. So lets convert it.. Soon! :-)

~Maarten

[1] See linux/include/asm-generic/mutex-xchg.h and linux/include/asm-generic/mutex-dec.h for how
archs generally implement mutex fastpaths.
Thomas Hellstrom Nov. 5, 2012, 2:09 p.m. UTC | #2
On 11/05/2012 03:01 PM, Maarten Lankhorst wrote:
> Hey,
>
> Op 05-11-12 14:31, Thomas Hellstrom schreef:
>> Reservation locking currently always takes place under the LRU spinlock.
>> Hence, strictly there is no need for an atomic_cmpxchg call; we can use
>> atomic_read followed by atomic_write since nobody else will ever reserve
>> without the lru spinlock held.
>> At least on Intel this should remove a locked bus cycle on successful
>> reserve.
>>
>> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
>>
> Is that really a good thing to submit when I am busy killing lru lock around reserve? :-)
>

If your patch series makes it into the same kernel, let's kill this 
patch. Otherwise it may live
at least for a kernel release. It's not a big thing to rebase against, 
and I won't complain if your
patch adds another atomic read-modify-write op here. :)

/Thomas
diff mbox

Patch

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index bf6e4b5..46008ea 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -220,7 +220,7 @@  int ttm_bo_reserve_locked(struct ttm_buffer_object *bo,
 	struct ttm_bo_global *glob = bo->glob;
 	int ret;
 
-	while (unlikely(atomic_cmpxchg(&bo->reserved, 0, 1) != 0)) {
+	while (unlikely(atomic_read(&bo->reserved) != 0)) {
 		/**
 		 * Deadlock avoidance for multi-bo reserving.
 		 */
@@ -249,6 +249,7 @@  int ttm_bo_reserve_locked(struct ttm_buffer_object *bo,
 			return ret;
 	}
 
+	atomic_set(&bo->reserved, 1);
 	if (use_sequence) {
 		/**
 		 * Wake up waiters that may need to recheck for deadlock,