From patchwork Wed Mar 10 16:20:43 2010 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pauli Nieminen X-Patchwork-Id: 84637 Received: from lists.sourceforge.net (lists.sourceforge.net [216.34.181.88]) by demeter.kernel.org (8.14.3/8.14.3) with ESMTP id o2AGfSm3022842 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Wed, 10 Mar 2010 16:42:04 GMT Received: from localhost ([127.0.0.1] helo=sfs-ml-1.v29.ch3.sourceforge.com) by sfs-ml-1.v29.ch3.sourceforge.com with esmtp (Exim 4.69) (envelope-from ) id 1NpOwS-0007ng-9R; Wed, 10 Mar 2010 16:39:28 +0000 Received: from sfi-mx-3.v28.ch3.sourceforge.com ([172.29.28.123] helo=mx.sourceforge.net) by sfs-ml-1.v29.ch3.sourceforge.com with esmtp (Exim 4.69) (envelope-from ) id 1NpOwR-0007nX-3M for dri-devel@lists.sourceforge.net; Wed, 10 Mar 2010 16:39:27 +0000 Received-SPF: neutral (sfi-mx-3.v28.ch3.sourceforge.com: 213.243.153.184 is neither permitted nor denied by domain of gmail.com) client-ip=213.243.153.184; envelope-from=suokkos@gmail.com; helo=filtteri1.pp.htv.fi; Received: from filtteri1.pp.htv.fi ([213.243.153.184]) by sfi-mx-3.v28.ch3.sourceforge.com with esmtp (Exim 4.69) id 1NpOwP-0007Q3-Qb for dri-devel@lists.sourceforge.net; Wed, 10 Mar 2010 16:39:27 +0000 Received: from localhost (localhost [127.0.0.1]) by filtteri1.pp.htv.fi (Postfix) with ESMTP id 69EDF8BBD3; Wed, 10 Mar 2010 18:21:02 +0200 (EET) X-Virus-Scanned: Debian amavisd-new at pp.htv.fi Received: from smtp5.welho.com ([213.243.153.39]) by localhost (filtteri1.pp.htv.fi [213.243.153.184]) (amavisd-new, port 10024) with ESMTP id 7HPU8EfJLGpG; Wed, 10 Mar 2010 18:21:01 +0200 (EET) Received: from localhost.localdomain (cs181130083.pp.htv.fi [82.181.130.83]) by smtp5.welho.com (Postfix) with ESMTP id BE9B65BC008; Wed, 10 Mar 2010 18:21:01 +0200 (EET) From: Pauli Nieminen To: dri-devel@lists.sourceforge.net Subject: [PATCH 2/2] libdrm_radeon: Optimize reloc writing to do less looping. Date: Wed, 10 Mar 2010 18:20:43 +0200 Message-Id: <1268238043-4453-2-git-send-email-suokkos@gmail.com> X-Mailer: git-send-email 1.6.3.3 In-Reply-To: <1268238043-4453-1-git-send-email-suokkos@gmail.com> References: <1268238043-4453-1-git-send-email-suokkos@gmail.com> X-Spam-Score: 1.2 (+) X-Spam-Report: Spam Filtering performed by mx.sourceforge.net. See http://spamassassin.org/tag/ for more details. 1.2 SPF_NEUTRAL SPF: sender does not match SPF record (neutral) X-Headers-End: 1NpOwP-0007Q3-Qb Cc: intel-gfx@lists.freedesktop.org X-BeenThere: dri-devel@lists.sourceforge.net X-Mailman-Version: 2.1.9 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: dri-devel-bounces@lists.sourceforge.net X-Greylist: IP, sender and recipient auto-whitelisted, not delayed by milter-greylist-4.2.3 (demeter.kernel.org [140.211.167.41]); Wed, 10 Mar 2010 16:42:04 +0000 (UTC) diff --git a/radeon/radeon_bo_gem.c b/radeon/radeon_bo_gem.c index bc8058d..1b33bdb 100644 --- a/radeon/radeon_bo_gem.c +++ b/radeon/radeon_bo_gem.c @@ -80,6 +80,7 @@ static struct radeon_bo *bo_open(struct radeon_bo_manager *bom, bo->base.domains = domains; bo->base.flags = flags; bo->base.ptr = NULL; + bo->base.referenced_in_cs = 0; bo->map_count = 0; if (handle) { struct drm_gem_open open_arg; diff --git a/radeon/radeon_cs.c b/radeon/radeon_cs.c index cc9be39..d0e922b 100644 --- a/radeon/radeon_cs.c +++ b/radeon/radeon_cs.c @@ -88,3 +88,9 @@ void radeon_cs_space_set_flush(struct radeon_cs *cs, void (*fn)(void *), void *d csi->space_flush_fn = fn; csi->space_flush_data = data; } + +uint32_t radeon_cs_get_id(struct radeon_cs *cs) +{ + struct radeon_cs_int *csi = (struct radeon_cs_int *)cs; + return csi->id; +} diff --git a/radeon/radeon_cs.h b/radeon/radeon_cs.h index 49d5d9a..7f6ee68 100644 --- a/radeon/radeon_cs.h +++ b/radeon/radeon_cs.h @@ -85,7 +85,7 @@ extern int radeon_cs_write_reloc(struct radeon_cs *cs, uint32_t read_domain, uint32_t write_domain, uint32_t flags); - +extern uint32_t radeon_cs_get_id(struct radeon_cs *cs); /* * add a persistent BO to the list * a persistent BO is one that will be referenced across flushes, diff --git a/radeon/radeon_cs_gem.c b/radeon/radeon_cs_gem.c index 45a219c..83aabea 100644 --- a/radeon/radeon_cs_gem.c +++ b/radeon/radeon_cs_gem.c @@ -32,6 +32,7 @@ #include #include #include +#include #include #include #include "radeon_cs.h" @@ -68,6 +69,66 @@ struct cs_gem { struct radeon_bo_int **relocs_bo; }; + +#if !defined(__GNUC__) || __GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 2) +/* no built in sync support in compiler define place holders */ +uint32_t __sync_add_and_fetch(uint32_t *a, uint32_t val) +{ + *a += val; + return val; +} + +uint32_t __sync_add_and_fetch(uint32_t *a, uint32_t val) +{ + *a -= val; + return val; +} +#endif + +static pthread_mutex_t id_mutex = PTHREAD_MUTEX_INITIALIZER; +static uint32_t cs_id_source = 0; + +/** + * result is undefined if called with ~0 + */ +static uint32_t get_first_zero(const uint32_t n) +{ + /* __builtin_ctz returns number of trailing zeros. */ + return 1 << __builtin_ctz(~n); +} + +/** + * Returns a free id for cs. + * If there is no free id we return zero + **/ +static uint32_t generate_id(void) +{ + uint32_t r = 0; + pthread_mutex_lock( &id_mutex ); + /* check for free ids */ + if (cs_id_source != ~r) { + /* find first zero bit */ + r = get_first_zero(cs_id_source); + + /* set id as reserved */ + cs_id_source |= r; + } + pthread_mutex_unlock( &id_mutex ); + return r; +} + +/** + * Free the id for later reuse + **/ +static void free_id(uint32_t id) +{ + pthread_mutex_lock( &id_mutex ); + + cs_id_source &= ~id; + + pthread_mutex_unlock( &id_mutex ); +} + static struct radeon_cs_int *cs_gem_create(struct radeon_cs_manager *csm, uint32_t ndw) { @@ -90,6 +151,7 @@ static struct radeon_cs_int *cs_gem_create(struct radeon_cs_manager *csm, } csg->base.relocs_total_size = 0; csg->base.crelocs = 0; + csg->base.id = generate_id(); csg->nrelocs = 4096 / (4 * 4) ; csg->relocs_bo = (struct radeon_bo_int**)calloc(1, csg->nrelocs*sizeof(void*)); @@ -141,38 +203,43 @@ static int cs_gem_write_reloc(struct radeon_cs_int *cs, if (write_domain == RADEON_GEM_DOMAIN_CPU) { return -EINVAL; } - /* check if bo is already referenced */ - for(i = 0; i < cs->crelocs; i++) { - idx = i * RELOC_SIZE; - reloc = (struct cs_reloc_gem*)&csg->relocs[idx]; - if (reloc->handle == bo->handle) { - /* Check domains must be in read or write. As we check already - * checked that in argument one of the read or write domain was - * set we only need to check that if previous reloc as the read - * domain set then the read_domain should also be set for this - * new relocation. - */ - /* the DDX expects to read and write from same pixmap */ - if (write_domain && (reloc->read_domain & write_domain)) { - reloc->read_domain = 0; - reloc->write_domain = write_domain; - } else if (read_domain & reloc->write_domain) { - reloc->read_domain = 0; - } else { - if (write_domain != reloc->write_domain) - return -EINVAL; - if (read_domain != reloc->read_domain) - return -EINVAL; + /* use bit field hash functionto determine + if this bo is for sure not in this cs.*/ + if ((boi->referenced_in_cs & cs->id)) { + /* check if bo is already referenced */ + for(i = cs->crelocs; i != 0;) { + --i; + idx = i * RELOC_SIZE; + reloc = (struct cs_reloc_gem*)&csg->relocs[idx]; + if (reloc->handle == bo->handle) { + /* Check domains must be in read or write. As we check already + * checked that in argument one of the read or write domain was + * set we only need to check that if previous reloc as the read + * domain set then the read_domain should also be set for this + * new relocation. + */ + /* the DDX expects to read and write from same pixmap */ + if (write_domain && (reloc->read_domain & write_domain)) { + reloc->read_domain = 0; + reloc->write_domain = write_domain; + } else if (read_domain & reloc->write_domain) { + reloc->read_domain = 0; + } else { + if (write_domain != reloc->write_domain) + return -EINVAL; + if (read_domain != reloc->read_domain) + return -EINVAL; + } + + reloc->read_domain |= read_domain; + reloc->write_domain |= write_domain; + /* update flags */ + reloc->flags |= (flags & reloc->flags); + /* write relocation packet */ + radeon_cs_write_dword((struct radeon_cs *)cs, 0xc0001000); + radeon_cs_write_dword((struct radeon_cs *)cs, idx); + return 0; } - - reloc->read_domain |= read_domain; - reloc->write_domain |= write_domain; - /* update flags */ - reloc->flags |= (flags & reloc->flags); - /* write relocation packet */ - radeon_cs_write_dword((struct radeon_cs *)cs, 0xc0001000); - radeon_cs_write_dword((struct radeon_cs *)cs, idx); - return 0; } } /* new relocation */ @@ -203,6 +270,7 @@ static int cs_gem_write_reloc(struct radeon_cs_int *cs, reloc->flags = flags; csg->chunks[1].length_dw += RELOC_SIZE; radeon_bo_ref(bo); + __sync_add_and_fetch(&boi->referenced_in_cs, cs->id); cs->relocs_total_size += boi->size; radeon_cs_write_dword((struct radeon_cs *)cs, 0xc0001000); radeon_cs_write_dword((struct radeon_cs *)cs, idx); @@ -288,6 +356,7 @@ static int cs_gem_emit(struct radeon_cs_int *cs) &csg->cs, sizeof(struct drm_radeon_cs)); for (i = 0; i < csg->base.crelocs; i++) { csg->relocs_bo[i]->space_accounted = 0; + __sync_sub_and_fetch(&csg->relocs_bo[i]->referenced_in_cs, cs->id); radeon_bo_unref((struct radeon_bo *)csg->relocs_bo[i]); csg->relocs_bo[i] = NULL; } @@ -302,6 +371,7 @@ static int cs_gem_destroy(struct radeon_cs_int *cs) { struct cs_gem *csg = (struct cs_gem*)cs; + free_id(cs->id); free(csg->relocs_bo); free(cs->relocs); free(cs->packets); @@ -317,6 +387,7 @@ static int cs_gem_erase(struct radeon_cs_int *cs) if (csg->relocs_bo) { for (i = 0; i < csg->base.crelocs; i++) { if (csg->relocs_bo[i]) { + __sync_sub_and_fetch(&csg->relocs_bo[i]->referenced_in_cs, cs->id); radeon_bo_unref((struct radeon_bo *)csg->relocs_bo[i]); csg->relocs_bo[i] = NULL; } diff --git a/radeon/radeon_cs_int.h b/radeon/radeon_cs_int.h index 8ba76bf..6cee574 100644 --- a/radeon/radeon_cs_int.h +++ b/radeon/radeon_cs_int.h @@ -28,6 +28,7 @@ struct radeon_cs_int { int bo_count; void (*space_flush_fn)(void *); void *space_flush_data; + uint32_t id; }; /* cs functions */