diff mbox

[RFC,1/4] drm/i915: Implement a framework for batch buffer pools

Message ID 1403109376-23452-2-git-send-email-bradley.d.volkin@intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

bradley.d.volkin@intel.com June 18, 2014, 4:36 p.m. UTC
From: Brad Volkin <bradley.d.volkin@intel.com>

This adds a small module for managing a pool of batch buffers.
The only current use case is for the command parser, as described
in the kerneldoc in the patch. The code is simple, but separating
it out makes it easier to change the underlying algorithms and to
extend to future use cases should they arise.

The interface is simple: alloc to create an empty pool, free to
clean it up; get to obtain a new buffer, put to return it to the
pool. Note that all buffers must be returned to the pool before
freeing it.

The pool has a maximum number of buffers allowed due to some tests
(e.g. gem_exec_nop) creating a very large number of buffers (e.g.
___). Buffers are purgeable while in the pool, but not explicitly
truncated in order to avoid overhead during execbuf.

Locking is currently based on the caller holding the struct_mutex.
We already do that in the places where we will use the batch pool
for the command parser.

Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
---

r.e. pool capacity
My original testing showed something like thousands of buffers in
the pool after a gem_exec_nop run. But when I reran with the max
check disabled just now to get an actual number for the commit
message, the number was more like 130.  I developed and tested the
changes incrementally, and suspect that the original run was before
I implemented the actual copy operation. So I'm inclined to remove
or at least increase the cap in the final version. Thoughts?

---
 Documentation/DocBook/drm.tmpl             |   5 +
 drivers/gpu/drm/i915/Makefile              |   1 +
 drivers/gpu/drm/i915/i915_drv.h            |  19 ++++
 drivers/gpu/drm/i915/i915_gem.c            |   1 +
 drivers/gpu/drm/i915/i915_gem_batch_pool.c | 151 +++++++++++++++++++++++++++++
 5 files changed, 177 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/i915_gem_batch_pool.c

Comments

Tvrtko Ursulin June 19, 2014, 9:48 a.m. UTC | #1
Hi Brad,

On 06/18/2014 05:36 PM, bradley.d.volkin@intel.com wrote:
> From: Brad Volkin <bradley.d.volkin@intel.com>
>
> This adds a small module for managing a pool of batch buffers.
> The only current use case is for the command parser, as described
> in the kerneldoc in the patch. The code is simple, but separating
> it out makes it easier to change the underlying algorithms and to
> extend to future use cases should they arise.
>
> The interface is simple: alloc to create an empty pool, free to
> clean it up; get to obtain a new buffer, put to return it to the
> pool. Note that all buffers must be returned to the pool before
> freeing it.
>
> The pool has a maximum number of buffers allowed due to some tests
> (e.g. gem_exec_nop) creating a very large number of buffers (e.g.
> ___). Buffers are purgeable while in the pool, but not explicitly
> truncated in order to avoid overhead during execbuf.
>
> Locking is currently based on the caller holding the struct_mutex.
> We already do that in the places where we will use the batch pool
> for the command parser.
>
> Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
> ---
>
> r.e. pool capacity
> My original testing showed something like thousands of buffers in
> the pool after a gem_exec_nop run. But when I reran with the max
> check disabled just now to get an actual number for the commit
> message, the number was more like 130.  I developed and tested the
> changes incrementally, and suspect that the original run was before
> I implemented the actual copy operation. So I'm inclined to remove
> or at least increase the cap in the final version. Thoughts?

Some random thoughts:

Is it strictly necessary to cap the pool size? I ask because it seems to 
be introducing a limit where so far there wasn't an explicit one.

Are object sizes generally page aligned or all you've seen all sizes in 
the distribution? Either way, I am thinking whether some sort of 
rounding up would be more efficient? Would it cause a problem if 
slightly larger object was returned?

Given that objects are only ever added to the pool, once max number is 
allocated and there are no free ones of exact size it nags userspace 
with EAGAIN and retires objects. But I wonder if the above points could 
reduce that behaviour?

Could we get away without tracking the given out objects in a list and 
just keep a list of available ones? In which case if object can only 
ever be either in the free pool or on one of the existing GEM 
active/inactive lists the same list head could be used?

Could it use its own locking just as easily? Just thinking if the future 
goal is to fine grain locking, this seems self contained enough to be 
doable straight away unless I am missing something.

The above would make the pool more generic, but then I read Chris's 
reply which perhaps suggests to make it more specialised so I don't know.

Regards,

Tvrtko
bradley.d.volkin@intel.com June 19, 2014, 5:35 p.m. UTC | #2
On Thu, Jun 19, 2014 at 02:48:29AM -0700, Tvrtko Ursulin wrote:
> 
> Hi Brad,
> 
> On 06/18/2014 05:36 PM, bradley.d.volkin@intel.com wrote:
> > From: Brad Volkin <bradley.d.volkin@intel.com>
> >
> > This adds a small module for managing a pool of batch buffers.
> > The only current use case is for the command parser, as described
> > in the kerneldoc in the patch. The code is simple, but separating
> > it out makes it easier to change the underlying algorithms and to
> > extend to future use cases should they arise.
> >
> > The interface is simple: alloc to create an empty pool, free to
> > clean it up; get to obtain a new buffer, put to return it to the
> > pool. Note that all buffers must be returned to the pool before
> > freeing it.
> >
> > The pool has a maximum number of buffers allowed due to some tests
> > (e.g. gem_exec_nop) creating a very large number of buffers (e.g.
> > ___). Buffers are purgeable while in the pool, but not explicitly
> > truncated in order to avoid overhead during execbuf.
> >
> > Locking is currently based on the caller holding the struct_mutex.
> > We already do that in the places where we will use the batch pool
> > for the command parser.
> >
> > Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
> > ---
> >
> > r.e. pool capacity
> > My original testing showed something like thousands of buffers in
> > the pool after a gem_exec_nop run. But when I reran with the max
> > check disabled just now to get an actual number for the commit
> > message, the number was more like 130.  I developed and tested the
> > changes incrementally, and suspect that the original run was before
> > I implemented the actual copy operation. So I'm inclined to remove
> > or at least increase the cap in the final version. Thoughts?
> 
> Some random thoughts:
> 
> Is it strictly necessary to cap the pool size? I ask because it seems to 
> be introducing a limit where so far there wasn't an explicit one.

No, I only added it because there were a huge number of buffers in the
pool at one point. But that seems to have been an artifact of my
development process, so unless someone says they really want to keep
the cap, I'm going to drop it in the next rev.

> 
> Are object sizes generally page aligned or all you've seen all sizes in 
> the distribution? Either way, I am thinking whether some sort of 
> rounding up would be more efficient? Would it cause a problem if 
> slightly larger object was returned?

I believe objects must have page-aligned sizes.

> 
> Given that objects are only ever added to the pool, once max number is 
> allocated and there are no free ones of exact size it nags userspace 
> with EAGAIN and retires objects. But I wonder if the above points could 
> reduce that behaviour?
> 
> Could we get away without tracking the given out objects in a list and 
> just keep a list of available ones? In which case if object can only 
> ever be either in the free pool or on one of the existing GEM 
> active/inactive lists the same list head could be used?
> 
> Could it use its own locking just as easily? Just thinking if the future 
> goal is to fine grain locking, this seems self contained enough to be 
> doable straight away unless I am missing something.

I thought about putting a spinlock in the batch pool struct for
exactly that reason. But we have to hold the struct_mutex for at least
drm_gem_object_unreference(). I guess we don't actually need it for
i915_gem_alloc_object(). I opted to just put the mutex_is_locked()
checks so the dependency on the lock being held was hopefully easy to
find by the poor soul who eventually tries to tackle the locking
improvements :). But if people would rather I add a lock to the batch
pool struct, I can do that.

Thanks,
Brad

> 
> The above would make the pool more generic, but then I read Chris's 
> reply which perhaps suggests to make it more specialised so I don't know.
> 
> Regards,
> 
> Tvrtko
> 
> 
> 
> 
> 
> 
> 
> 
> 
>
Daniel Vetter June 19, 2014, 7:07 p.m. UTC | #3
On Thu, Jun 19, 2014 at 10:35:44AM -0700, Volkin, Bradley D wrote:
> On Thu, Jun 19, 2014 at 02:48:29AM -0700, Tvrtko Ursulin wrote:
> > 
> > Hi Brad,
> > 
> > On 06/18/2014 05:36 PM, bradley.d.volkin@intel.com wrote:
> > > From: Brad Volkin <bradley.d.volkin@intel.com>
> > >
> > > This adds a small module for managing a pool of batch buffers.
> > > The only current use case is for the command parser, as described
> > > in the kerneldoc in the patch. The code is simple, but separating
> > > it out makes it easier to change the underlying algorithms and to
> > > extend to future use cases should they arise.
> > >
> > > The interface is simple: alloc to create an empty pool, free to
> > > clean it up; get to obtain a new buffer, put to return it to the
> > > pool. Note that all buffers must be returned to the pool before
> > > freeing it.
> > >
> > > The pool has a maximum number of buffers allowed due to some tests
> > > (e.g. gem_exec_nop) creating a very large number of buffers (e.g.
> > > ___). Buffers are purgeable while in the pool, but not explicitly
> > > truncated in order to avoid overhead during execbuf.
> > >
> > > Locking is currently based on the caller holding the struct_mutex.
> > > We already do that in the places where we will use the batch pool
> > > for the command parser.
> > >
> > > Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
> > > ---
> > >
> > > r.e. pool capacity
> > > My original testing showed something like thousands of buffers in
> > > the pool after a gem_exec_nop run. But when I reran with the max
> > > check disabled just now to get an actual number for the commit
> > > message, the number was more like 130.  I developed and tested the
> > > changes incrementally, and suspect that the original run was before
> > > I implemented the actual copy operation. So I'm inclined to remove
> > > or at least increase the cap in the final version. Thoughts?
> > 
> > Some random thoughts:
> > 
> > Is it strictly necessary to cap the pool size? I ask because it seems to 
> > be introducing a limit where so far there wasn't an explicit one.
> 
> No, I only added it because there were a huge number of buffers in the
> pool at one point. But that seems to have been an artifact of my
> development process, so unless someone says they really want to keep
> the cap, I'm going to drop it in the next rev.

As long as we have a single global lru and mark buffers correctly as
purgeable the shrinker logic will size your working set optimally. So I
don't think a cap is necessary as long as you reuse eagerly.

> > Are object sizes generally page aligned or all you've seen all sizes in 
> > the distribution? Either way, I am thinking whether some sort of 
> > rounding up would be more efficient? Would it cause a problem if 
> > slightly larger object was returned?
> 
> I believe objects must have page-aligned sizes.

Yes.

> > Given that objects are only ever added to the pool, once max number is 
> > allocated and there are no free ones of exact size it nags userspace 
> > with EAGAIN and retires objects. But I wonder if the above points could 
> > reduce that behaviour?
> > 
> > Could we get away without tracking the given out objects in a list and 
> > just keep a list of available ones? In which case if object can only 
> > ever be either in the free pool or on one of the existing GEM 
> > active/inactive lists the same list head could be used?
> > 
> > Could it use its own locking just as easily? Just thinking if the future 
> > goal is to fine grain locking, this seems self contained enough to be 
> > doable straight away unless I am missing something.
> 
> I thought about putting a spinlock in the batch pool struct for
> exactly that reason. But we have to hold the struct_mutex for at least
> drm_gem_object_unreference(). I guess we don't actually need it for
> i915_gem_alloc_object(). I opted to just put the mutex_is_locked()
> checks so the dependency on the lock being held was hopefully easy to
> find by the poor soul who eventually tries to tackle the locking
> improvements :). But if people would rather I add a lock to the batch
> pool struct, I can do that.

If we ever get around to more fine-grained locking I expect three classes
of locks: per-gem-object locks, one lru lock for all these lists and a
low-level lock for actually submitting batches to the hw (i.e. tail/ring
updates and execlist submission). But before we have that adding
fine-grained locks that always must nest within dev->struct_mutex will
only make the eventual conversion harder.
-Daniel
Tvrtko Ursulin June 20, 2014, 1:25 p.m. UTC | #4
On 06/19/2014 06:35 PM, Volkin, Bradley D wrote:
> On Thu, Jun 19, 2014 at 02:48:29AM -0700, Tvrtko Ursulin wrote:
>>
>> Hi Brad,
>>
>> On 06/18/2014 05:36 PM, bradley.d.volkin@intel.com wrote:
>>> From: Brad Volkin <bradley.d.volkin@intel.com>
>>>
>>> This adds a small module for managing a pool of batch buffers.
>>> The only current use case is for the command parser, as described
>>> in the kerneldoc in the patch. The code is simple, but separating
>>> it out makes it easier to change the underlying algorithms and to
>>> extend to future use cases should they arise.
>>>
>>> The interface is simple: alloc to create an empty pool, free to
>>> clean it up; get to obtain a new buffer, put to return it to the
>>> pool. Note that all buffers must be returned to the pool before
>>> freeing it.
>>>
>>> The pool has a maximum number of buffers allowed due to some tests
>>> (e.g. gem_exec_nop) creating a very large number of buffers (e.g.
>>> ___). Buffers are purgeable while in the pool, but not explicitly
>>> truncated in order to avoid overhead during execbuf.
>>>
>>> Locking is currently based on the caller holding the struct_mutex.
>>> We already do that in the places where we will use the batch pool
>>> for the command parser.
>>>
>>> Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
>>> ---
>>>
>>> r.e. pool capacity
>>> My original testing showed something like thousands of buffers in
>>> the pool after a gem_exec_nop run. But when I reran with the max
>>> check disabled just now to get an actual number for the commit
>>> message, the number was more like 130.  I developed and tested the
>>> changes incrementally, and suspect that the original run was before
>>> I implemented the actual copy operation. So I'm inclined to remove
>>> or at least increase the cap in the final version. Thoughts?
>>
>> Some random thoughts:
>>
>> Is it strictly necessary to cap the pool size? I ask because it seems to
>> be introducing a limit where so far there wasn't an explicit one.
>
> No, I only added it because there were a huge number of buffers in the
> pool at one point. But that seems to have been an artifact of my
> development process, so unless someone says they really want to keep
> the cap, I'm going to drop it in the next rev.

Cap or no cap (I am for no cap), but the pool is still "grow only" at 
the moment, no? So one allocation storm and objects on the pool inactive 
list end up wasting memory forever.

Unless my novice eyes are missing something hidden? But it can't be 
since then there would have to be a mechanism letting the pool know that 
some objects got expired.

Regards,

Tvrtko
bradley.d.volkin@intel.com June 20, 2014, 3:30 p.m. UTC | #5
On Fri, Jun 20, 2014 at 06:25:56AM -0700, Tvrtko Ursulin wrote:
> 
> On 06/19/2014 06:35 PM, Volkin, Bradley D wrote:
> > On Thu, Jun 19, 2014 at 02:48:29AM -0700, Tvrtko Ursulin wrote:
> >>
> >> Hi Brad,
> >>
> >> On 06/18/2014 05:36 PM, bradley.d.volkin@intel.com wrote:
> >>> From: Brad Volkin <bradley.d.volkin@intel.com>
> >>>
> >>> This adds a small module for managing a pool of batch buffers.
> >>> The only current use case is for the command parser, as described
> >>> in the kerneldoc in the patch. The code is simple, but separating
> >>> it out makes it easier to change the underlying algorithms and to
> >>> extend to future use cases should they arise.
> >>>
> >>> The interface is simple: alloc to create an empty pool, free to
> >>> clean it up; get to obtain a new buffer, put to return it to the
> >>> pool. Note that all buffers must be returned to the pool before
> >>> freeing it.
> >>>
> >>> The pool has a maximum number of buffers allowed due to some tests
> >>> (e.g. gem_exec_nop) creating a very large number of buffers (e.g.
> >>> ___). Buffers are purgeable while in the pool, but not explicitly
> >>> truncated in order to avoid overhead during execbuf.
> >>>
> >>> Locking is currently based on the caller holding the struct_mutex.
> >>> We already do that in the places where we will use the batch pool
> >>> for the command parser.
> >>>
> >>> Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
> >>> ---
> >>>
> >>> r.e. pool capacity
> >>> My original testing showed something like thousands of buffers in
> >>> the pool after a gem_exec_nop run. But when I reran with the max
> >>> check disabled just now to get an actual number for the commit
> >>> message, the number was more like 130.  I developed and tested the
> >>> changes incrementally, and suspect that the original run was before
> >>> I implemented the actual copy operation. So I'm inclined to remove
> >>> or at least increase the cap in the final version. Thoughts?
> >>
> >> Some random thoughts:
> >>
> >> Is it strictly necessary to cap the pool size? I ask because it seems to
> >> be introducing a limit where so far there wasn't an explicit one.
> >
> > No, I only added it because there were a huge number of buffers in the
> > pool at one point. But that seems to have been an artifact of my
> > development process, so unless someone says they really want to keep
> > the cap, I'm going to drop it in the next rev.
> 
> Cap or no cap (I am for no cap), but the pool is still "grow only" at 
> the moment, no? So one allocation storm and objects on the pool inactive 
> list end up wasting memory forever.

Oh, so what happens is that when you put() an object back in the pool, we
set obj->madv = I915_MADV_DONTNEED, which should tell the shrinker that it
can drop the backing storage for the object if we need space. When you get()
an object, we set obj->madv = I915_MADV_WILLNEED and get new backing pages.
So the number of objects grows (capped or not), but the memory used can be
controlled.

Brad

> 
> Unless my novice eyes are missing something hidden? But it can't be 
> since then there would have to be a mechanism letting the pool know that 
> some objects got expired.
> 
> Regards,
> 
> Tvrtko
> 
>
Tvrtko Ursulin June 20, 2014, 3:41 p.m. UTC | #6
On 06/20/2014 04:30 PM, Volkin, Bradley D wrote:
> On Fri, Jun 20, 2014 at 06:25:56AM -0700, Tvrtko Ursulin wrote:
>>
>> On 06/19/2014 06:35 PM, Volkin, Bradley D wrote:
>>> On Thu, Jun 19, 2014 at 02:48:29AM -0700, Tvrtko Ursulin wrote:
>>>>
>>>> Hi Brad,
>>>>
>>>> On 06/18/2014 05:36 PM, bradley.d.volkin@intel.com wrote:
>>>>> From: Brad Volkin <bradley.d.volkin@intel.com>
>>>>>
>>>>> This adds a small module for managing a pool of batch buffers.
>>>>> The only current use case is for the command parser, as described
>>>>> in the kerneldoc in the patch. The code is simple, but separating
>>>>> it out makes it easier to change the underlying algorithms and to
>>>>> extend to future use cases should they arise.
>>>>>
>>>>> The interface is simple: alloc to create an empty pool, free to
>>>>> clean it up; get to obtain a new buffer, put to return it to the
>>>>> pool. Note that all buffers must be returned to the pool before
>>>>> freeing it.
>>>>>
>>>>> The pool has a maximum number of buffers allowed due to some tests
>>>>> (e.g. gem_exec_nop) creating a very large number of buffers (e.g.
>>>>> ___). Buffers are purgeable while in the pool, but not explicitly
>>>>> truncated in order to avoid overhead during execbuf.
>>>>>
>>>>> Locking is currently based on the caller holding the struct_mutex.
>>>>> We already do that in the places where we will use the batch pool
>>>>> for the command parser.
>>>>>
>>>>> Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
>>>>> ---
>>>>>
>>>>> r.e. pool capacity
>>>>> My original testing showed something like thousands of buffers in
>>>>> the pool after a gem_exec_nop run. But when I reran with the max
>>>>> check disabled just now to get an actual number for the commit
>>>>> message, the number was more like 130.  I developed and tested the
>>>>> changes incrementally, and suspect that the original run was before
>>>>> I implemented the actual copy operation. So I'm inclined to remove
>>>>> or at least increase the cap in the final version. Thoughts?
>>>>
>>>> Some random thoughts:
>>>>
>>>> Is it strictly necessary to cap the pool size? I ask because it seems to
>>>> be introducing a limit where so far there wasn't an explicit one.
>>>
>>> No, I only added it because there were a huge number of buffers in the
>>> pool at one point. But that seems to have been an artifact of my
>>> development process, so unless someone says they really want to keep
>>> the cap, I'm going to drop it in the next rev.
>>
>> Cap or no cap (I am for no cap), but the pool is still "grow only" at
>> the moment, no? So one allocation storm and objects on the pool inactive
>> list end up wasting memory forever.
>
> Oh, so what happens is that when you put() an object back in the pool, we
> set obj->madv = I915_MADV_DONTNEED, which should tell the shrinker that it
> can drop the backing storage for the object if we need space. When you get()
> an object, we set obj->madv = I915_MADV_WILLNEED and get new backing pages.
> So the number of objects grows (capped or not), but the memory used can be
> controlled.

Education time for me I see. :)

So the object is in pool->inactive_list _and_ in some other list so 
shrinker can find it?

Thanks,

Tvrtko
bradley.d.volkin@intel.com June 20, 2014, 4:06 p.m. UTC | #7
On Fri, Jun 20, 2014 at 08:41:08AM -0700, Tvrtko Ursulin wrote:
> 
> On 06/20/2014 04:30 PM, Volkin, Bradley D wrote:
> > On Fri, Jun 20, 2014 at 06:25:56AM -0700, Tvrtko Ursulin wrote:
> >>
> >> On 06/19/2014 06:35 PM, Volkin, Bradley D wrote:
> >>> On Thu, Jun 19, 2014 at 02:48:29AM -0700, Tvrtko Ursulin wrote:
> >>>>
> >>>> Hi Brad,
> >>>>
> >>>> On 06/18/2014 05:36 PM, bradley.d.volkin@intel.com wrote:
> >> Cap or no cap (I am for no cap), but the pool is still "grow only" at
> >> the moment, no? So one allocation storm and objects on the pool inactive
> >> list end up wasting memory forever.
> >
> > Oh, so what happens is that when you put() an object back in the pool, we
> > set obj->madv = I915_MADV_DONTNEED, which should tell the shrinker that it
> > can drop the backing storage for the object if we need space. When you get()
> > an object, we set obj->madv = I915_MADV_WILLNEED and get new backing pages.
> > So the number of objects grows (capped or not), but the memory used can be
> > controlled.
> 
> Education time for me I see. :)
> 
> So the object is in pool->inactive_list _and_ in some other list so 
> shrinker can find it?

Yes. In fact, they are on several other lists. Here's my understanding:

The dev_priv->mm struct has bound_list and unbound_list, which track which
objects are or are not bound in some gtt. The shrinker operates on these
lists to drop backing storage when we need physical space. The pool objects
are added to these lists when we explicitly call ggtt_pin() and  when the
object eventually gets an unbind().

The ring structs have an active_list, which track objects that are used by
work still in progress on that ring. The i915_address_space structs have an
inactive_list, which contains vmas that are bound in that address space but
are not used by work still in progress. The driver uses these to evict
objects in a given gtt/ppgtt when we need GPU virtual address space. We
explicitly put a pool object's vma on the active_list with a move_to_active()
call at the end of do_execbuffer(). Retiring requests moves it to the
appropriate i915_address_space inactive_list.

Brad

> 
> Thanks,
> 
> Tvrtko
diff mbox

Patch

diff --git a/Documentation/DocBook/drm.tmpl b/Documentation/DocBook/drm.tmpl
index 7df3134..fcc0a1c 100644
--- a/Documentation/DocBook/drm.tmpl
+++ b/Documentation/DocBook/drm.tmpl
@@ -3939,6 +3939,11 @@  int num_ioctls;</synopsis>
 !Pdrivers/gpu/drm/i915/i915_cmd_parser.c batch buffer command parser
 !Idrivers/gpu/drm/i915/i915_cmd_parser.c
       </sect2>
+      <sect2>
+        <title>Batchbuffer Pools</title>
+!Pdrivers/gpu/drm/i915/i915_gem_batch_pool.c batch pool
+!Idrivers/gpu/drm/i915/i915_gem_batch_pool.c
+      </sect2>
     </sect1>
   </chapter>
 </part>
diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index cad1683..b92fbe6 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -17,6 +17,7 @@  i915-$(CONFIG_DEBUG_FS) += i915_debugfs.o
 
 # GEM code
 i915-y += i915_cmd_parser.o \
+	  i915_gem_batch_pool.o \
 	  i915_gem_context.o \
 	  i915_gem_render_state.o \
 	  i915_gem_debug.o \
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 0640071..2a88b5e 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1610,6 +1610,8 @@  struct drm_i915_gem_object {
 	/** Used in execbuf to temporarily hold a ref */
 	struct list_head obj_exec_link;
 
+	struct list_head batch_pool_list;
+
 	/**
 	 * This is set if the object is on the active lists (has pending
 	 * rendering and so a non-zero seqno), and is not set if it i s on
@@ -1727,6 +1729,14 @@  struct drm_i915_gem_object {
 };
 #define to_intel_bo(x) container_of(x, struct drm_i915_gem_object, base)
 
+struct i915_gem_batch_pool {
+	struct drm_device *dev;
+	struct list_head active_list;
+	struct list_head inactive_list;
+	int count;
+	int max_count;
+};
+
 /**
  * Request queue structure.
  *
@@ -2508,6 +2518,15 @@  void i915_destroy_error_state(struct drm_device *dev);
 void i915_get_extra_instdone(struct drm_device *dev, uint32_t *instdone);
 const char *i915_cache_level_str(int type);
 
+/* i915_gem_batch_pool.c */
+struct i915_gem_batch_pool *i915_gem_batch_pool_alloc(struct drm_device *dev,
+						      int max_count);
+void i915_gem_batch_pool_free(struct i915_gem_batch_pool *pool);
+struct drm_i915_gem_object*
+i915_gem_batch_pool_get(struct i915_gem_batch_pool *pool, size_t size);
+void i915_gem_batch_pool_put(struct i915_gem_batch_pool *pool,
+			     struct drm_i915_gem_object *obj);
+
 /* i915_cmd_parser.c */
 int i915_cmd_parser_get_version(void);
 int i915_cmd_parser_init_ring(struct intel_engine_cs *ring);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index d857f58..d5e3001 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4324,6 +4324,7 @@  void i915_gem_object_init(struct drm_i915_gem_object *obj,
 	INIT_LIST_HEAD(&obj->ring_list);
 	INIT_LIST_HEAD(&obj->obj_exec_link);
 	INIT_LIST_HEAD(&obj->vma_list);
+	INIT_LIST_HEAD(&obj->batch_pool_list);
 
 	obj->ops = ops;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_batch_pool.c b/drivers/gpu/drm/i915/i915_gem_batch_pool.c
new file mode 100644
index 0000000..37cec7d
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_gem_batch_pool.c
@@ -0,0 +1,151 @@ 
+/*
+ * Copyright © 2014 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+#include "i915_drv.h"
+
+/**
+ * DOC: batch pool
+ *
+ * In order to submit batch buffers as 'secure', the software command parser
+ * must ensure that a batch buffer cannot be modified after parsing. It does
+ * this by copying the user provided batch buffer contents to a kernel owned
+ * buffer from which the hardware will actually execute, and by carefully
+ * managing the address space bindings for such buffers.
+ *
+ * The batch pool framework provides a mechanism for the driver to manage a
+ * set of scratch buffers to use for this purpose. The framework can be
+ * extended to support other uses cases should they arise.
+ */
+
+/**
+ * i915_gem_batch_pool_alloc() - allocate and initialize a batch buffer pool
+ * @dev: the drm device
+ * @max_count: the maximum number of buffers allowed in the pool
+ *
+ * Return: the new batch buffer pool
+ */
+struct i915_gem_batch_pool *i915_gem_batch_pool_alloc(struct drm_device *dev,
+						      int max_count)
+{
+	struct i915_gem_batch_pool *pool;
+
+	pool = kmalloc(sizeof(*pool), GFP_KERNEL);
+	if (!pool)
+		return ERR_PTR(-ENOMEM);
+
+	pool->dev = dev;
+	INIT_LIST_HEAD(&pool->active_list);
+	INIT_LIST_HEAD(&pool->inactive_list);
+	pool->count = 0;
+	pool->max_count = max_count;
+
+	return pool;
+}
+
+/**
+ * i915_gem_batch_pool_free() - clean up a batch buffer pool
+ * @pool: the pool to clean up
+ *
+ * Note: Callers must hold the struct_mutex. Callers must also ensure
+ *       that all buffers have been returned to the pool.
+ */
+void i915_gem_batch_pool_free(struct i915_gem_batch_pool *pool)
+{
+	struct drm_i915_gem_object *obj, *next;
+
+	BUG_ON(!mutex_is_locked(&pool->dev->struct_mutex));
+	BUG_ON(!list_empty(&pool->active_list));
+
+	list_for_each_entry_safe(obj, next,
+				 &pool->inactive_list, batch_pool_list) {
+		list_del(&obj->batch_pool_list);
+		drm_gem_object_unreference(&obj->base);
+	}
+
+	kfree(pool);
+}
+
+/**
+ * i915_gem_batch_pool_get() - select a buffer from the pool
+ * @pool: the batch buffer pool
+ * @size: the minimum desired size of the returned buffer
+ *
+ * Finds or allocates a batch buffer in the pool with at least the requested
+ * size. The buffer will not be used to satisfy further
+ * i915_gem_batch_pool_get() requests until the corresponding
+ * i915_gem_batch_pool_put() call. The caller is responsible for any domain or
+ * active/inactive management for the returned buffer.
+ *
+ * Note: Callers must hold the struct_mutex
+ *
+ * Return: the selected batch buffer object; -EAGAIN if the pool is at capacity
+ */
+struct drm_i915_gem_object *
+i915_gem_batch_pool_get(struct i915_gem_batch_pool *pool,
+			size_t size)
+{
+	struct drm_i915_gem_object *obj = NULL;
+	struct drm_i915_gem_object *tmp;
+
+	BUG_ON(!mutex_is_locked(&pool->dev->struct_mutex));
+
+	list_for_each_entry(tmp, &pool->inactive_list, batch_pool_list) {
+		if (tmp->base.size >= size) {
+			obj = tmp;
+			break;
+		}
+	}
+
+	if (!obj) {
+		if (pool->count == pool->max_count)
+			return ERR_PTR(-EAGAIN);
+
+		obj = i915_gem_alloc_object(pool->dev, size);
+		if (!obj)
+			return ERR_PTR(-ENOMEM);
+
+		list_add_tail(&obj->batch_pool_list, &pool->inactive_list);
+		pool->count++;
+	}
+
+	obj->madv = I915_MADV_WILLNEED;
+	list_move_tail(&obj->batch_pool_list, &pool->active_list);
+
+	return obj;
+}
+
+/**
+ * i915_gem_batch_pool_put() - return a buffer to the pool
+ * @pool: the batch buffer pool
+ * @obj: the batch buffer object
+ *
+ * Note: Callers must hold the struct_mutex
+ */
+void i915_gem_batch_pool_put(struct i915_gem_batch_pool *pool,
+			     struct drm_i915_gem_object *obj)
+{
+	BUG_ON(!mutex_is_locked(&pool->dev->struct_mutex));
+	obj->madv = I915_MADV_DONTNEED;
+	list_move_tail(&obj->batch_pool_list, &pool->inactive_list);
+}