From patchwork Wed Mar 17 20:50:00 2010
X-Patchwork-Submitter: Pauli Nieminen
X-Patchwork-Id: 86513
From: Pauli Nieminen
To: dri-devel@lists.sourceforge.net
Cc: Dave Airlie, Jerome Glisse
Subject: [PATCH 1/7] drm/ttm: add pool wc/uc page allocator
Date: Wed, 17 Mar 2010 22:50:00 +0200
Message-Id: <1268859006-18707-2-git-send-email-suokkos@gmail.com>
In-Reply-To: <1268859006-18707-1-git-send-email-suokkos@gmail.com>
References: <1268859006-18707-1-git-send-email-suokkos@gmail.com>
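This patch adds a pool allocator for write-combined/uncached pages: a new drivers/gpu/drm/ttm/ttm_page_alloc.c with per caching-state pools, init/teardown hooks in ttm_memory.c, and a switch of ttm_tt.c over to the list-based ttm_get_pages()/ttm_put_pages() interface declared in include/drm/ttm/ttm_page_alloc.h. As a rough sketch of the intended calling pattern (illustrative only, not part of the patch; the function name and the omission of the ttm_mem_global accounting that ttm_tt.c performs are assumptions made for the example):

/* Illustrative sketch only -- not part of this patch.
 * Needs <linux/list.h> and "ttm/ttm_page_alloc.h".
 * Shows the list-based calling pattern ttm_tt.c uses below: build an empty
 * list, ask the pool for pages with the wanted caching state, and hand the
 * same list back so the pages return to the pool for reuse.
 */
static int example_alloc_wc_pages(unsigned count)
{
	struct list_head pages;
	int r;

	INIT_LIST_HEAD(&pages);

	/* tt_wc selects the write-combined pool; OR TTM_PAGE_FLAG_DMA32
	 * into the flags argument to use the DMA32 pools instead. */
	r = ttm_get_pages(&pages, 0, tt_wc, count);
	if (r)
		return r;

	/* ... bind the pages and use them ... */

	/* Unused pages go back to the matching pool. */
	ttm_put_pages(&pages, 0, tt_wc);
	return 0;
}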
diff --git a/drivers/gpu/drm/ttm/Makefile b/drivers/gpu/drm/ttm/Makefile index 1e138f5..4256e20 100644 --- a/drivers/gpu/drm/ttm/Makefile +++ b/drivers/gpu/drm/ttm/Makefile @@ -4,6 +4,6 @@ ccflags-y := -Iinclude/drm ttm-y := ttm_agp_backend.o ttm_memory.o ttm_tt.o ttm_bo.o \ ttm_bo_util.o ttm_bo_vm.o ttm_module.o ttm_global.o \ - ttm_object.o ttm_lock.o ttm_execbuf_util.o + ttm_object.o ttm_lock.o ttm_execbuf_util.o ttm_page_alloc.o obj-$(CONFIG_DRM_TTM) += ttm.o diff --git a/drivers/gpu/drm/ttm/ttm_memory.c b/drivers/gpu/drm/ttm/ttm_memory.c index eb143e0..e4c7cea 100644 --- a/drivers/gpu/drm/ttm/ttm_memory.c +++ b/drivers/gpu/drm/ttm/ttm_memory.c @@ -27,6 +27,7 @@ #include "ttm/ttm_memory.h" #include "ttm/ttm_module.h" +#include "ttm/ttm_page_alloc.h" #include #include #include @@ -394,6 +395,7 @@ int ttm_mem_global_init(struct ttm_mem_global *glob) "Zone %7s: Available graphics memory: %llu kiB.\n", zone->name, (unsigned long long) zone->max_mem >> 10); } + ttm_page_alloc_init(glob); return 0; out_no_zone: ttm_mem_global_release(glob); @@ -406,6 +408,9 @@ void ttm_mem_global_release(struct ttm_mem_global *glob) unsigned int i; struct ttm_mem_zone *zone; + /* let the page allocator first stop the shrink work. */ + ttm_page_alloc_fini(); + flush_workqueue(glob->swap_queue); destroy_workqueue(glob->swap_queue); glob->swap_queue = NULL; @@ -413,7 +418,7 @@ void ttm_mem_global_release(struct ttm_mem_global *glob) zone = glob->zones[i]; kobject_del(&zone->kobj); kobject_put(&zone->kobj); - } + } kobject_del(&glob->kobj); kobject_put(&glob->kobj); } diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc.c b/drivers/gpu/drm/ttm/ttm_page_alloc.c new file mode 100644 index 0000000..768d479 --- /dev/null +++ b/drivers/gpu/drm/ttm/ttm_page_alloc.c @@ -0,0 +1,775 @@ +/* + * Copyright (c) Red Hat Inc. + + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sub license, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the + * next paragraph) shall be included in all copies or substantial portions + * of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER + * DEALINGS IN THE SOFTWARE. + * + * Authors: Dave Airlie + * Jerome Glisse + * Pauli Nieminen + */ + +/* simple list based uncached page pool + * - Pool collects recently freed pages for reuse + * - Use page->lru to keep a free list + * - doesn't track currently in use pages + */ +#include +#include +#include +#include +#include +#include +#include + +#include +#include + +#include "ttm/ttm_bo_driver.h" +#include "ttm/ttm_page_alloc.h" + + +#define NUM_PAGES_TO_ALLOC 256 +#define SMALL_ALLOCATION 64 +#define FREE_ALL_PAGES 1 +/* times are in msecs */ +#define PAGE_FREE_INTERVAL 1000 + +/** + * struct ttm_page_pool - Pool to reuse recently allocated uc/wc pages. + * + * @lock: Protects the shared pool from concurrent access. Must be used with + * irqsave/irqrestore variants because the pool allocator may be called from + * delayed work. + * @fill_lock: Prevent concurrent calls to fill. + * @list: Pool of free uc/wc pages for fast reuse + * @gfp_flags: Flags to pass for alloc_page. + * @npages: Number of pages in pool + * @nlowpages: Minimum number of pages in pool since the previous shrink + * operation. + * @alloc_size: Allocation size of this pool. + */ +struct ttm_page_pool { + spinlock_t lock; + bool fill_lock; + struct list_head list; + int gfp_flags; + unsigned npages; + unsigned nlowpages; + unsigned alloc_size; +}; + +#define NUM_POOLS 4 + +/** + * struct ttm_pool_manager - Holds memory pools for fast allocation + * + * Manager is a read-only object for pool code so it doesn't need locking. + * + * @free_interval: minimum number of jiffies between freeing pages from pool. + * @glob: Global memory object for shrinker registration. + * @page_alloc_inited: reference counting for pool allocation. + * @work: Work that is used to shrink the pool. Work is only run when there are + * some pages to free. + * @small_allocation: Limit, in number of pages, below which an allocation counts as small. + * + * @pools: All pool objects in use.
+ **/ +struct ttm_pool_manager { + unsigned long free_interval; + struct ttm_mem_global *glob; + atomic_t page_alloc_inited; + struct delayed_work work; + unsigned small_allocation; + + union { + struct ttm_page_pool pools[NUM_POOLS]; + struct { + struct ttm_page_pool wc_pool; + struct ttm_page_pool uc_pool; + struct ttm_page_pool wc_pool_dma32; + struct ttm_page_pool uc_pool_dma32; + } ; + }; +}; + +static struct ttm_pool_manager _manager = { + .page_alloc_inited = ATOMIC_INIT(0) +}; + +#ifdef CONFIG_X86 +/* TODO: add this to x86 like _uc, this version here is inefficient */ +static int set_pages_array_wc(struct page **pages, int addrinarray) +{ + int i; + + for (i = 0; i < addrinarray; i++) + set_memory_wc((unsigned long)page_address(pages[i]), 1); + return 0; +} +#else +static int set_pages_array_wb(struct page **pages, int addrinarray) +{ +#ifdef TTM_HAS_AGP + int i; + + for (i = 0; i < addrinarray; i++) + unmap_page_from_agp(pages[i]); +#endif + return 0; +} + +static int set_pages_array_wc(struct page **pages, int addrinarray) +{ +#ifdef TTM_HAS_AGP + int i; + + for (i = 0; i < addrinarray; i++) + map_page_into_agp(pages[i]); +#endif + return 0; +} + +static int set_pages_array_uc(struct page **pages, int addrinarray) +{ +#ifdef TTM_HAS_AGP + int i; + + for (i = 0; i < addrinarray; i++) + map_page_into_agp(pages[i]); +#endif + return 0; +} +#endif + +/** + * Select the right pool or requested caching state and ttm flags. */ +static struct ttm_page_pool *ttm_get_pool(int flags, + enum ttm_caching_state cstate) +{ + int pool_index; + + if (cstate == tt_cached) + return NULL; + + if (cstate == tt_wc) + pool_index = 0x0; + else + pool_index = 0x1; + + if (flags & TTM_PAGE_FLAG_DMA32) + pool_index |= 0x2; + + return &_manager.pools[pool_index]; +} + +/* set memory back to wb and free the pages. */ +static void ttm_pages_put(struct page *pages[], unsigned npages) +{ + unsigned i; + if (set_pages_array_wb(pages, npages)) + printk(KERN_ERR "[ttm] Failed to set %d pages to wb!\n", + npages); + for (i = 0; i < npages; ++i) + __free_page(pages[i]); +} + +/** + * reset nlowpages after all pools have been cleaned in this run. + **/ +static bool ttm_reset_pools(struct ttm_pool_manager *manager) +{ + unsigned long irq_flags; + bool pages_in_pool = false; + unsigned i; + for (i = 0; i < NUM_POOLS; ++i) { + spin_lock_irqsave(&manager->pools[i].lock, irq_flags); + manager->pools[i].nlowpages = manager->pools[i].npages; + pages_in_pool = pages_in_pool + || manager->pools[i].npages > manager->pools[i].alloc_size; + spin_unlock_irqrestore(&manager->pools[i].lock, irq_flags); + } + return pages_in_pool; +} + +/** + * Calculate amount of pages to free from pool in this run. + * + * Must be called with pool lock held. + **/ +static unsigned ttm_page_pool_get_npages_to_free_locked(struct ttm_page_pool *pool) +{ + unsigned r; + /* If less than alloc sizes was the lowest number of pages we don't + * free any */ + if (pool->nlowpages < pool->alloc_size) + return 0; + /* leave half of unused pages to pool */ + r = (pool->nlowpages - pool->alloc_size)/2; + if (r) + return r; + /* make sure we remove all pages even when there is rounding down */ + if (pool->nlowpages) + return 1; + return 0; +} + +/** + * Update pool counters match pool state after freeing pages. + * + * Must be called with pool lock held. 
+ */ +static bool ttm_page_pool_free_pages_locked(struct ttm_page_pool *pool, + unsigned freed_pages) +{ + unsigned tmp; + pool->npages -= freed_pages; + /* Calculate number of pages taken from nlowpages + * npages_to_free = 1/2*nlowpages => + * nlowpages_delta = 2*freed_pages + */ + tmp = 2*freed_pages; + /* protect against rounding errors */ + if (tmp < pool->nlowpages) { + pool->nlowpages -= tmp; + return true; + } + + pool->nlowpages = 0; + return false; +} + +/** + * Free pages from pool. + * + * To prevent hogging the ttm_swap process we only free NUM_PAGES_TO_ALLOC + * number of pages in one go. + * + * @pool: to free the pages from + * @free_all: If set to true will free all pages in pool + **/ +static bool ttm_page_pool_free(struct ttm_page_pool *pool, const int free_all) +{ + unsigned long irq_flags; + struct page *p; + struct page **pages_to_free; + unsigned freed_pages, npages_to_free; + bool more_work = false; + + pages_to_free = kmalloc(NUM_PAGES_TO_ALLOC * sizeof(struct page *), + GFP_KERNEL); + if (!pages_to_free) { + printk(KERN_ERR "Failed to allocate memory for pool free operation.\n"); + return true; + } + +restart: + spin_lock_irqsave(&pool->lock, irq_flags); + + npages_to_free = ttm_page_pool_get_npages_to_free_locked(pool); + + freed_pages = 0; + if (unlikely(free_all)) + npages_to_free = pool->npages; + + list_for_each_entry_reverse(p, &pool->list, lru) { + if (freed_pages >= npages_to_free) + break; + + pages_to_free[freed_pages++] = p; + /* We can only remove NUM_PAGES_TO_ALLOC at a time. */ + if (freed_pages >= NUM_PAGES_TO_ALLOC) { + /* remove range of page sfrom the pool */ + __list_del(p->lru.prev, &pool->list); + /* update pool to counters match what is in pool. + * return value tells us if we are finnished with this + * free operation. + **/ + more_work = ttm_page_pool_free_pages_locked(pool, freed_pages); + /** + * Because changing page caching is costly + * we unlock the pool to prevent stalling. + */ + spin_unlock_irqrestore(&pool->lock, irq_flags); + + ttm_pages_put(pages_to_free, freed_pages); + + /* free all so restart the processing */ + if (unlikely(free_all)) + goto restart; + /* Now out of here to let others jobs run in ttm_swap */ + goto out; + + } + } + + pool->npages -= freed_pages; + /* set nlowpages to zero to prevent extra freeing in thsi patch. + * nlowpages is reseted later after all work has been finnished. + **/ + pool->nlowpages = 0; + + /* remove range of pages from the pool */ + if (freed_pages) + __list_del(&p->lru, &pool->list); + + spin_unlock_irqrestore(&pool->lock, irq_flags); + + if (freed_pages) + ttm_pages_put(pages_to_free, freed_pages); +out: + kfree(pages_to_free); + return more_work; +} + +/** + * Callback for workqueue. + * + * We limit the work that is done in single go to let others task run too in + * shared workqueue. If ttm_page_pool_free signals there is more work to do + * we immediately queue new work. + * + * @w: work structure that can't be used to find the related data. + **/ +static void ttm_pool_shrink(struct work_struct *w) +{ + struct delayed_work *work = + container_of(w, struct delayed_work, work); + struct ttm_pool_manager *manager = + container_of(work, struct ttm_pool_manager, work); + unsigned i; + bool more_work = false; + + + for (i = 0; i < NUM_POOLS; ++i) + more_work = more_work || ttm_page_pool_free(&manager->pools[i], 0); + + /* Queue more work to be done. This forces cleanup to use + * cancel_delayed_work_sync() to break the loop. 
*/ + if (!more_work) { + /* reset the pool counters and queue more work if there is + * pages to free */ + if (ttm_reset_pools(manager)) + (void)queue_delayed_work(manager->glob->swap_queue, + &manager->work, + round_jiffies(manager->free_interval)); + } else /* Restart work with short delay because there is more work */ + (void)queue_delayed_work(manager->glob->swap_queue, + &manager->work, HZ); + +} + + +static int ttm_set_pages_caching(struct page **pages, + enum ttm_caching_state cstate, unsigned cpages) +{ + int r = 0; + /* Set page caching */ + switch (cstate) { + case tt_uncached: + r = set_pages_array_uc(pages, cpages); + if (r) + printk(KERN_ERR "[ttm] Failed to set %d pages to uc!\n", + cpages); + break; + case tt_wc: + r = set_pages_array_wc(pages, cpages); + if (r) + printk(KERN_ERR "[ttm] Failed to set %d pages to wc!\n", + cpages); + break; + default: + break; + } + return r; +} + +/** + * Free pages the pages that failed to change the caching state. If there is + * any pages that have changed their caching state already put them to the + * pool. + */ +static void ttm_handle_caching_state_failure(struct list_head *pages, + int ttm_flags, enum ttm_caching_state cstate, + struct page **failed_pages, unsigned cpages) +{ + unsigned i; + /* Failed pages has to be reed */ + for (i = 0; i < cpages; ++i) { + list_del(&failed_pages[i]->lru); + __free_page(failed_pages[i]); + } +} + +/** + * Allocate new pages with correct caching. + * + * This function is reentrant if caller updates count depending on number of + * pages returned in pages array. + */ +static int ttm_alloc_new_pages(struct list_head *pages, int gfp_flags, + int ttm_flags, enum ttm_caching_state cstate, unsigned count) +{ + struct page **caching_array; + struct page *p; + int r = 0; + unsigned i, cpages; + unsigned max_cpages = min(count, + (unsigned)(PAGE_SIZE/sizeof(struct page *))); + + /* allocate array for page caching change */ + caching_array = kmalloc(max_cpages*sizeof(struct page *), GFP_KERNEL); + + if (!caching_array) { + printk(KERN_ERR "[ttm] unable to allocate table for new pages."); + return -ENOMEM; + } + + for (i = 0, cpages = 0; i < count; ++i) { + p = alloc_page(gfp_flags); + + if (!p) { + printk(KERN_ERR "[ttm] unable to get page %u\n", i); + + /* store already allocated pages in the pool after + * setting the caching state */ + if (cpages) { + r = ttm_set_pages_caching(caching_array, cstate, cpages); + if (r) + ttm_handle_caching_state_failure(pages, + ttm_flags, cstate, + caching_array, cpages); + } + r = -ENOMEM; + goto out; + } + +#ifdef CONFIG_HIGHMEM + /* gfp flags of highmem page should never be dma32 so we + * we should be fine in such case + */ + if (!PageHighMem(p)) +#endif + { + caching_array[cpages++] = p; + if (cpages == max_cpages) { + + r = ttm_set_pages_caching(caching_array, + cstate, cpages); + if (r) { + ttm_handle_caching_state_failure(pages, + ttm_flags, cstate, + caching_array, cpages); + goto out; + } + cpages = 0; + } + } + + list_add(&p->lru, pages); + } + + if (cpages) { + r = ttm_set_pages_caching(caching_array, cstate, cpages); + if (r) + ttm_handle_caching_state_failure(pages, + ttm_flags, cstate, + caching_array, cpages); + } +out: + kfree(caching_array); + + return r; +} + +/** + * Fill the given pool if there isn't enough pages and requested number of + * pages is small. 
+ */ +static void ttm_page_pool_fill_locked(struct ttm_page_pool *pool, + int ttm_flags, enum ttm_caching_state cstate, unsigned count, + unsigned long irq_flags) +{ + struct list_head new_pages; + struct page *p, *tmp; + int r; + unsigned cpages = 0; + /** + * Only allow one pool fill operation at a time. + * If pool doesn't have enough pages for the allocation new pages are + * allocated from outside of pool. + */ + if (pool->fill_lock) + return; + + pool->fill_lock = true; + + if (count < _manager.small_allocation + && count > pool->npages) { + /* If allocation request is small and there is not enough + * pages in pool we fill the pool first */ + INIT_LIST_HEAD(&new_pages); + + /** + * Can't change page caching if in irqsave context. We have to + * drop the pool->lock. + */ + spin_unlock_irqrestore(&pool->lock, irq_flags); + r = ttm_alloc_new_pages(&new_pages, pool->gfp_flags, ttm_flags, + cstate, pool->alloc_size); + spin_lock_irqsave(&pool->lock, irq_flags); + + if (!r) { + list_splice(&new_pages, &pool->list); + pool->npages += pool->alloc_size; + /* Have to remmber to update the low number of pages + * too */ + pool->nlowpages += pool->alloc_size; + } else { + printk(KERN_ERR "[ttm] Failed to fill pool (%p).", pool); + /* If we have any pages left put them to the pool. */ + list_for_each_entry_safe(p, tmp, &pool->list, lru) { + ++cpages; + } + list_splice(&new_pages, &pool->list); + pool->npages += cpages; + pool->nlowpages += cpages; + } + + } + pool->fill_lock = false; +} + +/** + * Cut count nubmer of pages from the pool and put them to return list + * + * @return count of pages still to allocate to fill the request. + */ +static unsigned ttm_page_pool_get_pages(struct ttm_page_pool *pool, + struct list_head *pages, int ttm_flags, + enum ttm_caching_state cstate, unsigned count) +{ + unsigned long irq_flags; + struct list_head *p; + unsigned i; + + spin_lock_irqsave(&pool->lock, irq_flags); + ttm_page_pool_fill_locked(pool, ttm_flags, cstate, count, irq_flags); + + if (count >= pool->npages) { + /* take all pages from the pool */ + list_splice_init(&pool->list, pages); + count -= pool->npages; + pool->npages = 0; + pool->nlowpages = 0; + goto out; + } + /* find the last pages to include for requested number of pages */ + if (count <= pool->npages/2) { + i = 0; + list_for_each(p, &pool->list) { + if (++i == count) + break; + } + } else { + i = pool->npages + 1; + list_for_each_prev(p, &pool->list) { + if (--i == count) + break; + } + } + /* Cut count number of pages from pool */ + list_cut_position(pages, &pool->list, p); + pool->npages -= count; + count = 0; + if (pool->npages < pool->nlowpages) + pool->nlowpages = pool->npages; +out: + spin_unlock_irqrestore(&pool->lock, irq_flags); + return count; +} + +/* + * On success pages list will hold count number of correctly + * cached pages. 
+ */ +int ttm_get_pages(struct list_head *pages, int flags, + enum ttm_caching_state cstate, unsigned count) +{ + struct ttm_page_pool *pool = ttm_get_pool(flags, cstate); + struct page *p = NULL; + int gfp_flags = 0; + int r; + + /* set zero flag for page allocation if required */ + if (flags & TTM_PAGE_FLAG_ZERO_ALLOC) + gfp_flags |= __GFP_ZERO; + + /* No pool for cached pages */ + if (pool == NULL) { + if (flags & TTM_PAGE_FLAG_DMA32) + gfp_flags |= GFP_DMA32; + else + gfp_flags |= __GFP_HIGHMEM; + + for (r = 0; r < count; ++r) { + p = alloc_page(gfp_flags); + if (!p) { + + printk(KERN_ERR "[ttm] unable to allocate page."); + return -ENOMEM; + } + + list_add(&p->lru, pages); + } + return 0; + } + + + /* combine zero flag to pool flags */ + gfp_flags |= pool->gfp_flags; + + /* First we take pages from the pool */ + count = ttm_page_pool_get_pages(pool, pages, flags, cstate, count); + + /* clear the pages coming from the pool if requested */ + if (flags & TTM_PAGE_FLAG_ZERO_ALLOC) { + list_for_each_entry(p, pages, lru) { + clear_page(page_address(p)); + } + } + + /* If pool didn't have enough pages allocate new one. */ + if (count > 0) { + /* ttm_alloc_new_pages doesn't reference pool so we can run + * multiple requests in parallel. + **/ + r = ttm_alloc_new_pages(pages, gfp_flags, flags, cstate, count); + if (r) { + /* If there is any pages in the list put them back to + * the pool. */ + printk(KERN_ERR "[ttm] Failed to allocate extra pages " + "for large request."); + ttm_put_pages(pages, flags, cstate); + return r; + } + } + + + return 0; +} + +/* Put all pages in pages list to correct pool to wait for reuse */ +void ttm_put_pages(struct list_head *pages, int flags, + enum ttm_caching_state cstate) +{ + unsigned long irq_flags; + struct ttm_page_pool *pool = ttm_get_pool(flags, cstate); + struct page *p, *tmp; + unsigned page_count = 0; + bool work_for_shrink; + + if (pool == NULL) { + + /* No pool for this memory type so free the pages */ + + list_for_each_entry_safe(p, tmp, pages, lru) { + __free_page(p); + } + /* Make the pages list empty */ + INIT_LIST_HEAD(pages); + return; + } + + list_for_each_entry_safe(p, tmp, pages, lru) { + +#ifdef CONFIG_HIGHMEM + /* we don't have pool for highmem -> free them */ + if (PageHighMem(p)) { + list_del(&p->lru); + __free_page(p); + } else +#endif + { + ++page_count; + } + + } + + spin_lock_irqsave(&pool->lock, irq_flags); + list_splice_init(pages, &pool->list); + pool->npages += page_count; + work_for_shrink = pool->npages > pool->alloc_size; + spin_unlock_irqrestore(&pool->lock, irq_flags); + + if (work_for_shrink) + (void)queue_delayed_work(_manager.glob->swap_queue, + &_manager.work, + round_jiffies(_manager.free_interval)); +} + +static void ttm_page_pool_init_locked(struct ttm_page_pool *pool, int flags) +{ + spin_lock_init(&pool->lock); + pool->fill_lock = false; + INIT_LIST_HEAD(&pool->list); + pool->npages = pool->nlowpages = 0; + pool->alloc_size = NUM_PAGES_TO_ALLOC; + pool->gfp_flags = flags; +} + +int ttm_page_alloc_init(struct ttm_mem_global *glob) +{ + if (atomic_add_return(1, &_manager.page_alloc_inited) > 1) + return 0; + + printk(KERN_INFO "[ttm] Initializing pool allocator.\n"); + + ttm_page_pool_init_locked(&_manager.wc_pool, GFP_HIGHUSER); + + ttm_page_pool_init_locked(&_manager.uc_pool, GFP_HIGHUSER); + + ttm_page_pool_init_locked(&_manager.wc_pool_dma32, GFP_USER | GFP_DMA32); + + ttm_page_pool_init_locked(&_manager.uc_pool_dma32, GFP_USER | GFP_DMA32); + + _manager.free_interval = msecs_to_jiffies(PAGE_FREE_INTERVAL); + 
_manager.small_allocation = SMALL_ALLOCATION; + _manager.glob = glob; + + INIT_DELAYED_WORK(&_manager.work, ttm_pool_shrink); + + return 0; +} + +void ttm_page_alloc_fini() +{ + int i; + + if (atomic_sub_return(1, &_manager.page_alloc_inited) > 0) + return; + + printk(KERN_INFO "[ttm] Finilizing pool allocator.\n"); + + /* stop the shrinker from running */ + cancel_delayed_work_sync(&_manager.work); + + for (i = 0; i < NUM_POOLS; ++i) + ttm_page_pool_free(&_manager.pools[i], FREE_ALL_PAGES); +} diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c index a759170..8a6fc01 100644 --- a/drivers/gpu/drm/ttm/ttm_tt.c +++ b/drivers/gpu/drm/ttm/ttm_tt.c @@ -38,6 +38,7 @@ #include "ttm/ttm_module.h" #include "ttm/ttm_bo_driver.h" #include "ttm/ttm_placement.h" +#include "ttm/ttm_page_alloc.h" static int ttm_tt_swapin(struct ttm_tt *ttm); @@ -72,21 +73,6 @@ static void ttm_tt_free_page_directory(struct ttm_tt *ttm) ttm->pages = NULL; } -static struct page *ttm_tt_alloc_page(unsigned page_flags) -{ - gfp_t gfp_flags = GFP_USER; - - if (page_flags & TTM_PAGE_FLAG_ZERO_ALLOC) - gfp_flags |= __GFP_ZERO; - - if (page_flags & TTM_PAGE_FLAG_DMA32) - gfp_flags |= __GFP_DMA32; - else - gfp_flags |= __GFP_HIGHMEM; - - return alloc_page(gfp_flags); -} - static void ttm_tt_free_user_pages(struct ttm_tt *ttm) { int write; @@ -127,15 +113,21 @@ static void ttm_tt_free_user_pages(struct ttm_tt *ttm) static struct page *__ttm_tt_get_page(struct ttm_tt *ttm, int index) { struct page *p; + struct list_head h; struct ttm_mem_global *mem_glob = ttm->glob->mem_glob; int ret; while (NULL == (p = ttm->pages[index])) { - p = ttm_tt_alloc_page(ttm->page_flags); - if (!p) + INIT_LIST_HEAD(&h); + + ret = ttm_get_pages(&h, ttm->page_flags, ttm->caching_state, 1); + + if (ret != 0) return NULL; + p = list_first_entry(&h, struct page, lru); + ret = ttm_mem_global_alloc_page(mem_glob, p, false, false); if (unlikely(ret != 0)) goto out_err; @@ -244,10 +236,10 @@ static int ttm_tt_set_caching(struct ttm_tt *ttm, if (ttm->caching_state == c_state) return 0; - if (c_state != tt_cached) { - ret = ttm_tt_populate(ttm); - if (unlikely(ret != 0)) - return ret; + if (ttm->state == tt_unpopulated) { + /* Change caching but don't populate */ + ttm->caching_state = c_state; + return 0; } if (ttm->caching_state == tt_cached) @@ -298,13 +290,17 @@ EXPORT_SYMBOL(ttm_tt_set_placement_caching); static void ttm_tt_free_alloced_pages(struct ttm_tt *ttm) { int i; + unsigned count = 0; + struct list_head h; struct page *cur_page; struct ttm_backend *be = ttm->be; + INIT_LIST_HEAD(&h); + if (be) be->func->clear(be); - (void)ttm_tt_set_caching(ttm, tt_cached); for (i = 0; i < ttm->num_pages; ++i) { + cur_page = ttm->pages[i]; ttm->pages[i] = NULL; if (cur_page) { @@ -314,9 +310,11 @@ static void ttm_tt_free_alloced_pages(struct ttm_tt *ttm) "Leaking pages.\n"); ttm_mem_global_free_page(ttm->glob->mem_glob, cur_page); - __free_page(cur_page); + list_add(&cur_page->lru, &h); + count++; } } + ttm_put_pages(&h, ttm->page_flags, ttm->caching_state); ttm->state = tt_unpopulated; ttm->first_himem_page = ttm->num_pages; ttm->last_lomem_page = -1; diff --git a/include/drm/ttm/ttm_page_alloc.h b/include/drm/ttm/ttm_page_alloc.h new file mode 100644 index 0000000..485514a --- /dev/null +++ b/include/drm/ttm/ttm_page_alloc.h @@ -0,0 +1,64 @@ +/* + * Copyright (c) Red Hat Inc. 
+ + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sub license, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the + * next paragraph) shall be included in all copies or substantial portions + * of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER + * DEALINGS IN THE SOFTWARE. + * + * Authors: Dave Airlie + * Jerome Glisse + */ +#ifndef TTM_PAGE_ALLOC +#define TTM_PAGE_ALLOC + +#include "ttm_bo_driver.h" +#include "ttm_memory.h" + +/** + * Get count number of pages from pool to pages list. + * + * @pages: head of empty linked list where pages are filled. + * @flags: ttm flags for page allocation. + * @cstate: ttm caching state for the page. + * @count: number of pages to allocate. + */ +int ttm_get_pages(struct list_head *pages, int flags, + enum ttm_caching_state cstate, unsigned count); +/** + * Put linked list of pages to pool. + * + * @pages: list of pages to free. + * @flags: ttm flags for page allocation. + * @cstate: ttm caching state. + */ +void ttm_put_pages(struct list_head *pages, int flags, + enum ttm_caching_state cstate); +/** + * Initialize pool allocator. + * + * Pool allocator is internally reference counted so it can be initialized + * multiple times, but ttm_page_alloc_fini has to be called the same number of + * times. + */ +int ttm_page_alloc_init(struct ttm_mem_global *glob); +/** + * Free pool allocator. + */ +void ttm_page_alloc_fini(void); + +#endif
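The reference counting noted in the header above means every successful call to ttm_page_alloc_init() has to be balanced by exactly one ttm_page_alloc_fini(); only the final fini cancels the delayed shrink work and drains the pools. A minimal sketch of that pairing, assuming a hypothetical caller outside ttm_mem_global_init()/ttm_mem_global_release() (which are the only callers this patch adds):

/* Illustrative sketch only -- not part of this patch. */
static int example_start(struct ttm_mem_global *glob)
{
	int ret;

	ret = ttm_page_alloc_init(glob);	/* takes a reference */
	if (ret)
		return ret;

	/* ttm_get_pages()/ttm_put_pages() may be used from here on. */
	return 0;
}

static void example_stop(void)
{
	ttm_page_alloc_fini();	/* drops the reference taken above */
}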