[09/50] drm/i915: Plumb the context everywhere in the execbuffer path

Message ID 1399637360-4277-10-git-send-email-oscar.mateo@intel.com (mailing list archive)
State New, archived

Commit Message

oscar.mateo@intel.com May 9, 2014, 12:08 p.m. UTC
From: Oscar Mateo <oscar.mateo@intel.com>

The contexts are going to become very important pretty soon, and
we need to be able to access them in a number of places inside
the command submission path. The idea is that, when we need to
place commands inside a ringbuffer or update the tail register,
we know which context we are working with.

We leave intel_ring_begin() as a function macro to quickly adapt
legacy code, and introduce intel_ringbuffer_begin() as the first
of a set of new functions for ringbuffer manipulation (the rest
will come in subsequent patches).

No functional changes.

v2: Do not set the context to NULL. In legacy code, set it to
the default ring context (even if it doesn't get used later on).

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_dma.c            |  2 +-
 drivers/gpu/drm/i915/i915_drv.h            |  3 +-
 drivers/gpu/drm/i915/i915_gem.c            |  5 +-
 drivers/gpu/drm/i915/i915_gem_context.c    |  2 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 23 +++++----
 drivers/gpu/drm/i915/i915_gem_gtt.c        |  9 ++--
 drivers/gpu/drm/i915/intel_display.c       | 10 ++--
 drivers/gpu/drm/i915/intel_ringbuffer.c    | 80 +++++++++++++++++++-----------
 drivers/gpu/drm/i915/intel_ringbuffer.h    | 23 ++++++---
 9 files changed, 100 insertions(+), 57 deletions(-)
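
As a condensed illustration of the adaptation described above (taken from
the intel_ringbuffer.h and i915_drv.h hunks below), legacy call sites keep
their old names, which now expand to the context-aware functions using the
engine's default context:

	/* New context-aware entry point: */
	int __must_check intel_ringbuffer_begin(struct intel_engine *ring,
						struct i915_hw_context *ctx,
						int n);

	/* Legacy wrapper: existing callers compile unchanged and
	 * implicitly run on the engine's default context. */
	#define intel_ring_begin(ring, n) \
		intel_ringbuffer_begin(ring, ring->default_context, n)

	/* The same pattern is applied to __i915_add_request(): */
	#define i915_add_request(ring, seqno) \
		__i915_add_request(ring, ring->default_context, NULL, NULL, seqno)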

Comments

Chris Wilson May 16, 2014, 11:04 a.m. UTC | #1
On Fri, May 09, 2014 at 01:08:39PM +0100, oscar.mateo@intel.com wrote:
> From: Oscar Mateo <oscar.mateo@intel.com>
> 
> The contexts are going to become very important pretty soon, and
> we need to be able to access them in a number of places inside
> the command submission path. The idea is that, when we need to
> place commands inside a ringbuffer or update the tail register,
> we know which context we are working with.
> 
> We leave intel_ring_begin() as a function macro to quickly adapt
> legacy code, and introduce intel_ringbuffer_begin() as the first
> of a set of new functions for ringbuffer manipulation (the rest
> will come in subsequent patches).
> 
> No functional changes.
> 
> v2: Do not set the context to NULL. In legacy code, set it to
> the default ring context (even if it doesn't get used later on).

Won't rings be stored within the context? So the context should be
derivable from which ring the operation is being issued on.
-Chris
oscar.mateo@intel.com May 16, 2014, 11:11 a.m. UTC | #2
> -----Original Message-----
> From: Chris Wilson [mailto:chris@chris-wilson.co.uk]
> Sent: Friday, May 16, 2014 12:05 PM
> To: Mateo Lozano, Oscar
> Cc: intel-gfx@lists.freedesktop.org
> Subject: Re: [Intel-gfx] [PATCH 09/50] drm/i915: Plumb the context everywhere
> in the execbuffer path
> 
> On Fri, May 09, 2014 at 01:08:39PM +0100, oscar.mateo@intel.com wrote:
> > From: Oscar Mateo <oscar.mateo@intel.com>
> >
> > The contexts are going to become very important pretty soon, and we
> > need to be able to access them in a number of places inside the
> > command submission path. The idea is that, when we need to place
> > commands inside a ringbuffer or update the tail register, we know
> > which context we are working with.
> >
> > We leave intel_ring_begin() as a function macro to quickly adapt legacy
> > code, and introduce intel_ringbuffer_begin() as the first of a set of
> > new functions for ringbuffer manipulation (the rest will come in
> > subsequent patches).
> >
> > No functional changes.
> >
> > v2: Do not set the context to NULL. In legacy code, set it to the
> > default ring context (even if it doesn't get used later on).
> 
> Won't rings be stored within the context? So the context should be derivable
> from which ring the operation is being issued on.
> -Chris

Rings (as in "engine command streamer") still remain in dev_priv, and there are only four/five of them. What we store in the context is the new ringbuffer structure (which holds the head, tail, etc.) and the ringbuffer backing object. Knowing only the ring is not enough to derive the context.

- Oscar
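
A rough sketch of the layout Oscar describes (field names here are
illustrative, not the actual structure definitions): the engines stay
global in dev_priv, while each context carries its own ringbuffer state,
so the engine alone cannot identify which context is in use:

	/* Engines (command streamers) remain global: */
	struct drm_i915_private {
		struct intel_engine ring[I915_NUM_RINGS]; /* only four/five */
		/* ... */
	};

	/* Each context holds its own ringbuffer (head, tail, ...) and the
	 * GEM object backing it, so many contexts share one engine: */
	struct i915_hw_context {
		struct intel_ringbuffer ringbuf;
		struct drm_i915_gem_object *ringbuf_obj;
		/* ... */
	};

	/* Hence the context must be passed explicitly down the submission
	 * path; it cannot be derived from the engine. */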
Chris Wilson May 16, 2014, 11:31 a.m. UTC | #3
On Fri, May 16, 2014 at 11:11:38AM +0000, Mateo Lozano, Oscar wrote:
> > -----Original Message-----
> > From: Chris Wilson [mailto:chris@chris-wilson.co.uk]
> > Sent: Friday, May 16, 2014 12:05 PM
> > To: Mateo Lozano, Oscar
> > Cc: intel-gfx@lists.freedesktop.org
> > Subject: Re: [Intel-gfx] [PATCH 09/50] drm/i915: Plumb the context everywhere
> > in the execbuffer path
> > 
> > On Fri, May 09, 2014 at 01:08:39PM +0100, oscar.mateo@intel.com wrote:
> > > From: Oscar Mateo <oscar.mateo@intel.com>
> > >
> > > The contexts are going to become very important pretty soon, and we
> > > need to be able to access them in a number of places inside the
> > > command submission path. The idea is that, when we need to place
> > > commands inside a ringbuffer or update the tail register, we know
> > > which context we are working with.
> > >
> > > We leave intel_ring_begin() as a function macro to quickly adapt legacy
> > > code, and introduce intel_ringbuffer_begin() as the first of a set of
> > > new functions for ringbuffer manipulation (the rest will come in
> > > subsequent patches).
> > >
> > > No functional changes.
> > >
> > > v2: Do not set the context to NULL. In legacy code, set it to the
> > > default ring context (even if it doesn't get used later on).
> > 
> > Won't rings be stored within the context? So the context should be derivable
> > from which ring the operation is being issued on.
> > -Chris
> 
> Rings (as in "engine command streamer") still remain in dev_priv, and there are only four/five of them. What we store in the context is the new ringbuffer structure (which holds the head, tail, etc.) and the ringbuffer backing object. Knowing only the ring is not enough to derive the context.

Ugh, I thought an earlier restructuring request was that the logical
ring interface was context specific.
-Chris
Patch

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index eb3ce6d..a582a64 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -56,7 +56,7 @@ 
 	intel_ring_emit(LP_RING(dev_priv), x)
 
 #define ADVANCE_LP_RING() \
-	__intel_ring_advance(LP_RING(dev_priv))
+	__intel_ring_advance(LP_RING(dev_priv), LP_RING(dev_priv)->default_context)
 
 /**
  * Lock test for when it's just for synchronization of ring access.
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index ee27ce8..35b2ae4 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2270,11 +2270,12 @@  void i915_gem_cleanup_ring(struct drm_device *dev);
 int __must_check i915_gpu_idle(struct drm_device *dev);
 int __must_check i915_gem_suspend(struct drm_device *dev);
 int __i915_add_request(struct intel_engine *ring,
+		       struct i915_hw_context *ctx,
 		       struct drm_file *file,
 		       struct drm_i915_gem_object *batch_obj,
 		       u32 *seqno);
 #define i915_add_request(ring, seqno) \
-	__i915_add_request(ring, NULL, NULL, seqno)
+	__i915_add_request(ring, ring->default_context, NULL, NULL, seqno)
 int __must_check i915_wait_seqno(struct intel_engine *ring,
 				 uint32_t seqno);
 int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index e7565d9..774151c 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2171,6 +2171,7 @@  i915_gem_get_seqno(struct drm_device *dev, u32 *seqno)
 }
 
 int __i915_add_request(struct intel_engine *ring,
+		       struct i915_hw_context *ctx,
 		       struct drm_file *file,
 		       struct drm_i915_gem_object *obj,
 		       u32 *out_seqno)
@@ -2188,7 +2189,7 @@  int __i915_add_request(struct intel_engine *ring,
 	 * is that the flush _must_ happen before the next request, no matter
 	 * what.
 	 */
-	ret = intel_ring_flush_all_caches(ring);
+	ret = intel_ring_flush_all_caches(ring, ctx);
 	if (ret)
 		return ret;
 
@@ -2203,7 +2204,7 @@  int __i915_add_request(struct intel_engine *ring,
 	 */
 	request_ring_position = intel_ring_get_tail(ring);
 
-	ret = ring->add_request(ring);
+	ret = ring->add_request(ring, ctx);
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 4d37e20..50337ae 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -558,7 +558,7 @@  mi_set_context(struct intel_engine *ring,
 	 * itlb_before_ctx_switch.
 	 */
 	if (IS_GEN6(ring->dev)) {
-		ret = ring->flush(ring, I915_GEM_GPU_DOMAINS, 0);
+		ret = ring->flush(ring, new_context, I915_GEM_GPU_DOMAINS, 0);
 		if (ret)
 			return ret;
 	}
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 95e797e..c93941d 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -828,6 +828,7 @@  err:
 
 static int
 i915_gem_execbuffer_move_to_gpu(struct intel_engine *ring,
+				struct i915_hw_context *ctx,
 				struct list_head *vmas)
 {
 	struct i915_vma *vma;
@@ -856,7 +857,7 @@  i915_gem_execbuffer_move_to_gpu(struct intel_engine *ring,
 	/* Unconditionally invalidate gpu caches and ensure that we do flush
 	 * any residual writes from the previous batch.
 	 */
-	return intel_ring_invalidate_all_caches(ring);
+	return intel_ring_invalidate_all_caches(ring, ctx);
 }
 
 static bool
@@ -971,18 +972,20 @@  static void
 i915_gem_execbuffer_retire_commands(struct drm_device *dev,
 				    struct drm_file *file,
 				    struct intel_engine *ring,
+				    struct i915_hw_context *ctx,
 				    struct drm_i915_gem_object *obj)
 {
 	/* Unconditionally force add_request to emit a full flush. */
 	ring->gpu_caches_dirty = true;
 
 	/* Add a breadcrumb for the completion of the batch buffer */
-	(void)__i915_add_request(ring, file, obj, NULL);
+	(void)__i915_add_request(ring, ctx, file, obj, NULL);
 }
 
 static int
 i915_reset_gen7_sol_offsets(struct drm_device *dev,
-			    struct intel_engine *ring)
+			    struct intel_engine *ring,
+			    struct i915_hw_context *ctx)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	int ret, i;
@@ -992,7 +995,7 @@  i915_reset_gen7_sol_offsets(struct drm_device *dev,
 		return -EINVAL;
 	}
 
-	ret = intel_ring_begin(ring, 4 * 3);
+	ret = intel_ringbuffer_begin(ring, ctx, 4 * 3);
 	if (ret)
 		return ret;
 
@@ -1277,7 +1280,7 @@  i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	else
 		exec_start += i915_gem_obj_offset(batch_obj, vm);
 
-	ret = i915_gem_execbuffer_move_to_gpu(ring, &eb->vmas);
+	ret = i915_gem_execbuffer_move_to_gpu(ring, ctx, &eb->vmas);
 	if (ret)
 		goto err;
 
@@ -1287,7 +1290,7 @@  i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 
 	if (ring == &dev_priv->ring[RCS] &&
 	    mode != dev_priv->relative_constants_mode) {
-		ret = intel_ring_begin(ring, 4);
+		ret = intel_ringbuffer_begin(ring, ctx, 4);
 		if (ret)
 				goto err;
 
@@ -1301,7 +1304,7 @@  i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	}
 
 	if (args->flags & I915_EXEC_GEN7_SOL_RESET) {
-		ret = i915_reset_gen7_sol_offsets(dev, ring);
+		ret = i915_reset_gen7_sol_offsets(dev, ring, ctx);
 		if (ret)
 			goto err;
 	}
@@ -1315,14 +1318,14 @@  i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 			if (ret)
 				goto err;
 
-			ret = ring->dispatch_execbuffer(ring,
+			ret = ring->dispatch_execbuffer(ring, ctx,
 							exec_start, exec_len,
 							flags);
 			if (ret)
 				goto err;
 		}
 	} else {
-		ret = ring->dispatch_execbuffer(ring,
+		ret = ring->dispatch_execbuffer(ring, ctx,
 						exec_start, exec_len,
 						flags);
 		if (ret)
@@ -1332,7 +1335,7 @@  i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	trace_i915_gem_ring_dispatch(ring, intel_ring_get_seqno(ring), flags);
 
 	i915_gem_execbuffer_move_to_active(&eb->vmas, ring);
-	i915_gem_execbuffer_retire_commands(dev, file, ring, batch_obj);
+	i915_gem_execbuffer_retire_commands(dev, file, ring, ctx, batch_obj);
 
 err:
 	/* the request owns the ref now */
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 31b58ee..a0993c0 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -740,7 +740,8 @@  static int hsw_mm_switch(struct i915_hw_ppgtt *ppgtt,
 	}
 
 	/* NB: TLBs must be flushed and invalidated before a switch */
-	ret = ring->flush(ring, I915_GEM_GPU_DOMAINS, I915_GEM_GPU_DOMAINS);
+	ret = ring->flush(ring, ring->default_context,
+			I915_GEM_GPU_DOMAINS, I915_GEM_GPU_DOMAINS);
 	if (ret)
 		return ret;
 
@@ -784,7 +785,8 @@  static int gen7_mm_switch(struct i915_hw_ppgtt *ppgtt,
 	}
 
 	/* NB: TLBs must be flushed and invalidated before a switch */
-	ret = ring->flush(ring, I915_GEM_GPU_DOMAINS, I915_GEM_GPU_DOMAINS);
+	ret = ring->flush(ring, ring->default_context,
+			I915_GEM_GPU_DOMAINS, I915_GEM_GPU_DOMAINS);
 	if (ret)
 		return ret;
 
@@ -802,7 +804,8 @@  static int gen7_mm_switch(struct i915_hw_ppgtt *ppgtt,
 
 	/* XXX: RCS is the only one to auto invalidate the TLBs? */
 	if (ring->id != RCS) {
-		ret = ring->flush(ring, I915_GEM_GPU_DOMAINS, I915_GEM_GPU_DOMAINS);
+		ret = ring->flush(ring, ring->default_context,
+				I915_GEM_GPU_DOMAINS, I915_GEM_GPU_DOMAINS);
 		if (ret)
 			return ret;
 	}
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index f821147..d7c6ce5 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -8609,7 +8609,7 @@  static int intel_gen2_queue_flip(struct drm_device *dev,
 	intel_ring_emit(ring, 0); /* aux display base address, unused */
 
 	intel_mark_page_flip_active(intel_crtc);
-	__intel_ring_advance(ring);
+	__intel_ring_advance(ring, ring->default_context);
 	return 0;
 
 err_unpin:
@@ -8651,7 +8651,7 @@  static int intel_gen3_queue_flip(struct drm_device *dev,
 	intel_ring_emit(ring, MI_NOOP);
 
 	intel_mark_page_flip_active(intel_crtc);
-	__intel_ring_advance(ring);
+	__intel_ring_advance(ring, ring->default_context);
 	return 0;
 
 err_unpin:
@@ -8700,7 +8700,7 @@  static int intel_gen4_queue_flip(struct drm_device *dev,
 	intel_ring_emit(ring, pf | pipesrc);
 
 	intel_mark_page_flip_active(intel_crtc);
-	__intel_ring_advance(ring);
+	__intel_ring_advance(ring, ring->default_context);
 	return 0;
 
 err_unpin:
@@ -8745,7 +8745,7 @@  static int intel_gen6_queue_flip(struct drm_device *dev,
 	intel_ring_emit(ring, pf | pipesrc);
 
 	intel_mark_page_flip_active(intel_crtc);
-	__intel_ring_advance(ring);
+	__intel_ring_advance(ring, ring->default_context);
 	return 0;
 
 err_unpin:
@@ -8855,7 +8855,7 @@  static int intel_gen7_queue_flip(struct drm_device *dev,
 	intel_ring_emit(ring, (MI_NOOP));
 
 	intel_mark_page_flip_active(intel_crtc);
-	__intel_ring_advance(ring);
+	__intel_ring_advance(ring, ring->default_context);
 	return 0;
 
 err_unpin:
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 6d14dcb..3b43070 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -56,18 +56,21 @@  static bool intel_ring_stopped(struct intel_engine *ring)
 	return dev_priv->gpu_error.stop_rings & intel_ring_flag(ring);
 }
 
-void __intel_ring_advance(struct intel_engine *ring)
+void __intel_ring_advance(struct intel_engine *ring,
+			  struct i915_hw_context *ctx)
 {
 	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
 
 	ringbuf->tail &= ringbuf->size - 1;
 	if (intel_ring_stopped(ring))
 		return;
-	ring->write_tail(ring, ringbuf->tail);
+
+	ring->write_tail(ring, ctx, ringbuf->tail);
 }
 
 static int
 gen2_render_ring_flush(struct intel_engine *ring,
+		       struct i915_hw_context *ctx,
 		       u32	invalidate_domains,
 		       u32	flush_domains)
 {
@@ -94,6 +97,7 @@  gen2_render_ring_flush(struct intel_engine *ring,
 
 static int
 gen4_render_ring_flush(struct intel_engine *ring,
+		       struct i915_hw_context *ctx,
 		       u32	invalidate_domains,
 		       u32	flush_domains)
 {
@@ -224,7 +228,8 @@  intel_emit_post_sync_nonzero_flush(struct intel_engine *ring)
 
 static int
 gen6_render_ring_flush(struct intel_engine *ring,
-                         u32 invalidate_domains, u32 flush_domains)
+		       struct i915_hw_context *ctx,
+		       u32 invalidate_domains, u32 flush_domains)
 {
 	u32 flags = 0;
 	u32 scratch_addr = ring->scratch.gtt_offset + 2 * CACHELINE_BYTES;
@@ -318,6 +323,7 @@  static int gen7_ring_fbc_flush(struct intel_engine *ring, u32 value)
 
 static int
 gen7_render_ring_flush(struct intel_engine *ring,
+		       struct i915_hw_context *ctx,
 		       u32 invalidate_domains, u32 flush_domains)
 {
 	u32 flags = 0;
@@ -379,6 +385,7 @@  gen7_render_ring_flush(struct intel_engine *ring,
 
 static int
 gen8_render_ring_flush(struct intel_engine *ring,
+		       struct i915_hw_context *ctx,
 		       u32 invalidate_domains, u32 flush_domains)
 {
 	u32 flags = 0;
@@ -402,7 +409,7 @@  gen8_render_ring_flush(struct intel_engine *ring,
 		flags |= PIPE_CONTROL_GLOBAL_GTT_IVB;
 	}
 
-	ret = intel_ring_begin(ring, 6);
+	ret = intel_ringbuffer_begin(ring, ctx, 6);
 	if (ret)
 		return ret;
 
@@ -419,7 +426,7 @@  gen8_render_ring_flush(struct intel_engine *ring,
 }
 
 static void ring_write_tail(struct intel_engine *ring,
-			    u32 value)
+			    struct i915_hw_context *ctx, u32 value)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	I915_WRITE_TAIL(ring, value);
@@ -466,7 +473,7 @@  static bool stop_ring(struct intel_engine *ring)
 
 	I915_WRITE_CTL(ring, 0);
 	I915_WRITE_HEAD(ring, 0);
-	ring->write_tail(ring, 0);
+	ring->write_tail(ring, ring->default_context, 0);
 
 	if (!IS_GEN2(ring->dev)) {
 		(void)I915_READ_CTL(ring);
@@ -718,7 +725,8 @@  static int gen6_signal(struct intel_engine *signaller,
  * This acts like a signal in the canonical semaphore.
  */
 static int
-gen6_add_request(struct intel_engine *ring)
+gen6_add_request(struct intel_engine *ring,
+		 struct i915_hw_context *ctx)
 {
 	int ret;
 
@@ -730,7 +738,7 @@  gen6_add_request(struct intel_engine *ring)
 	intel_ring_emit(ring, I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT);
 	intel_ring_emit(ring, ring->outstanding_lazy_seqno);
 	intel_ring_emit(ring, MI_USER_INTERRUPT);
-	__intel_ring_advance(ring);
+	__intel_ring_advance(ring, ctx);
 
 	return 0;
 }
@@ -799,7 +807,8 @@  do {									\
 } while (0)
 
 static int
-pc_render_add_request(struct intel_engine *ring)
+pc_render_add_request(struct intel_engine *ring,
+		      struct i915_hw_context *ctx)
 {
 	u32 scratch_addr = ring->scratch.gtt_offset + 2 * CACHELINE_BYTES;
 	int ret;
@@ -841,7 +850,7 @@  pc_render_add_request(struct intel_engine *ring)
 	intel_ring_emit(ring, ring->scratch.gtt_offset | PIPE_CONTROL_GLOBAL_GTT);
 	intel_ring_emit(ring, ring->outstanding_lazy_seqno);
 	intel_ring_emit(ring, 0);
-	__intel_ring_advance(ring);
+	__intel_ring_advance(ring, ctx);
 
 	return 0;
 }
@@ -1053,6 +1062,7 @@  void intel_ring_setup_status_page(struct intel_engine *ring)
 
 static int
 bsd_ring_flush(struct intel_engine *ring,
+	       struct i915_hw_context *ctx,
 	       u32     invalidate_domains,
 	       u32     flush_domains)
 {
@@ -1069,7 +1079,8 @@  bsd_ring_flush(struct intel_engine *ring,
 }
 
 static int
-i9xx_add_request(struct intel_engine *ring)
+i9xx_add_request(struct intel_engine *ring,
+		 struct i915_hw_context *ctx)
 {
 	int ret;
 
@@ -1081,7 +1092,7 @@  i9xx_add_request(struct intel_engine *ring)
 	intel_ring_emit(ring, I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT);
 	intel_ring_emit(ring, ring->outstanding_lazy_seqno);
 	intel_ring_emit(ring, MI_USER_INTERRUPT);
-	__intel_ring_advance(ring);
+	__intel_ring_advance(ring, ctx);
 
 	return 0;
 }
@@ -1215,6 +1226,7 @@  gen8_ring_put_irq(struct intel_engine *ring)
 
 static int
 i965_dispatch_execbuffer(struct intel_engine *ring,
+			 struct i915_hw_context *ctx,
 			 u64 offset, u32 length,
 			 unsigned flags)
 {
@@ -1238,6 +1250,7 @@  i965_dispatch_execbuffer(struct intel_engine *ring,
 #define I830_BATCH_LIMIT (256*1024)
 static int
 i830_dispatch_execbuffer(struct intel_engine *ring,
+				struct i915_hw_context *ctx,
 				u64 offset, u32 len,
 				unsigned flags)
 {
@@ -1289,6 +1302,7 @@  i830_dispatch_execbuffer(struct intel_engine *ring,
 
 static int
 i915_dispatch_execbuffer(struct intel_engine *ring,
+			 struct i915_hw_context *ctx,
 			 u64 offset, u32 len,
 			 unsigned flags)
 {
@@ -1549,7 +1563,8 @@  static int intel_ring_wait_request(struct intel_engine *ring, int n)
 	return 0;
 }
 
-static int ring_wait_for_space(struct intel_engine *ring, int n)
+static int ring_wait_for_space(struct intel_engine *ring,
+			       struct i915_hw_context *ctx, int n)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -1562,7 +1577,7 @@  static int ring_wait_for_space(struct intel_engine *ring, int n)
 		return ret;
 
 	/* force the tail write in case we have been skipping them */
-	__intel_ring_advance(ring);
+	__intel_ring_advance(ring, ctx);
 
 	trace_i915_ring_wait_begin(ring);
 	/* With GEM the hangcheck timer should kick us out of the loop,
@@ -1598,14 +1613,15 @@  static int ring_wait_for_space(struct intel_engine *ring, int n)
 	return -EBUSY;
 }
 
-static int intel_wrap_ring_buffer(struct intel_engine *ring)
+static int intel_wrap_ring_buffer(struct intel_engine *ring,
+				  struct i915_hw_context *ctx)
 {
 	uint32_t __iomem *virt;
 	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
 	int rem = ringbuf->size - ringbuf->tail;
 
 	if (ringbuf->space < rem) {
-		int ret = ring_wait_for_space(ring, rem);
+		int ret = ring_wait_for_space(ring, ctx, rem);
 		if (ret)
 			return ret;
 	}
@@ -1664,19 +1680,19 @@  intel_ring_alloc_seqno(struct intel_engine *ring)
 }
 
 static int __intel_ring_prepare(struct intel_engine *ring,
-				int bytes)
+				struct i915_hw_context *ctx, int bytes)
 {
 	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
 	int ret;
 
 	if (unlikely(ringbuf->tail + bytes > ringbuf->effective_size)) {
-		ret = intel_wrap_ring_buffer(ring);
+		ret = intel_wrap_ring_buffer(ring, ctx);
 		if (unlikely(ret))
 			return ret;
 	}
 
 	if (unlikely(ringbuf->space < bytes)) {
-		ret = ring_wait_for_space(ring, bytes);
+		ret = ring_wait_for_space(ring, ctx, bytes);
 		if (unlikely(ret))
 			return ret;
 	}
@@ -1684,8 +1700,9 @@  static int __intel_ring_prepare(struct intel_engine *ring,
 	return 0;
 }
 
-int intel_ring_begin(struct intel_engine *ring,
-		     int num_dwords)
+int intel_ringbuffer_begin(struct intel_engine *ring,
+			      struct i915_hw_context *ctx,
+			      int num_dwords)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	struct intel_ringbuffer *ringbuf = __get_ringbuf(ring);
@@ -1696,7 +1713,7 @@  int intel_ring_begin(struct intel_engine *ring,
 	if (ret)
 		return ret;
 
-	ret = __intel_ring_prepare(ring, num_dwords * sizeof(uint32_t));
+	ret = __intel_ring_prepare(ring, ctx, num_dwords * sizeof(uint32_t));
 	if (ret)
 		return ret;
 
@@ -1750,7 +1767,7 @@  void intel_ring_init_seqno(struct intel_engine *ring, u32 seqno)
 }
 
 static void gen6_bsd_ring_write_tail(struct intel_engine *ring,
-				     u32 value)
+				     struct i915_hw_context *ctx, u32 value)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 
@@ -1783,6 +1800,7 @@  static void gen6_bsd_ring_write_tail(struct intel_engine *ring,
 }
 
 static int gen6_bsd_ring_flush(struct intel_engine *ring,
+			       struct i915_hw_context *ctx,
 			       u32 invalidate, u32 flush)
 {
 	uint32_t cmd;
@@ -1819,6 +1837,7 @@  static int gen6_bsd_ring_flush(struct intel_engine *ring,
 
 static int
 gen8_ring_dispatch_execbuffer(struct intel_engine *ring,
+			      struct i915_hw_context *ctx,
 			      u64 offset, u32 len,
 			      unsigned flags)
 {
@@ -1827,7 +1846,7 @@  gen8_ring_dispatch_execbuffer(struct intel_engine *ring,
 		!(flags & I915_DISPATCH_SECURE);
 	int ret;
 
-	ret = intel_ring_begin(ring, 4);
+	ret = intel_ringbuffer_begin(ring, ctx, 4);
 	if (ret)
 		return ret;
 
@@ -1843,6 +1862,7 @@  gen8_ring_dispatch_execbuffer(struct intel_engine *ring,
 
 static int
 hsw_ring_dispatch_execbuffer(struct intel_engine *ring,
+			      struct i915_hw_context *ctx,
 			      u64 offset, u32 len,
 			      unsigned flags)
 {
@@ -1864,6 +1884,7 @@  hsw_ring_dispatch_execbuffer(struct intel_engine *ring,
 
 static int
 gen6_ring_dispatch_execbuffer(struct intel_engine *ring,
+			      struct i915_hw_context *ctx,
 			      u64 offset, u32 len,
 			      unsigned flags)
 {
@@ -1886,6 +1907,7 @@  gen6_ring_dispatch_execbuffer(struct intel_engine *ring,
 /* Blitter support (SandyBridge+) */
 
 static int gen6_ring_flush(struct intel_engine *ring,
+			   struct i915_hw_context *ctx,
 			   u32 invalidate, u32 flush)
 {
 	struct drm_device *dev = ring->dev;
@@ -2296,14 +2318,15 @@  int intel_init_vebox_ring(struct drm_device *dev)
 }
 
 int
-intel_ring_flush_all_caches(struct intel_engine *ring)
+intel_ring_flush_all_caches(struct intel_engine *ring,
+			    struct i915_hw_context *ctx)
 {
 	int ret;
 
 	if (!ring->gpu_caches_dirty)
 		return 0;
 
-	ret = ring->flush(ring, 0, I915_GEM_GPU_DOMAINS);
+	ret = ring->flush(ring, ctx, 0, I915_GEM_GPU_DOMAINS);
 	if (ret)
 		return ret;
 
@@ -2314,7 +2337,8 @@  intel_ring_flush_all_caches(struct intel_engine *ring)
 }
 
 int
-intel_ring_invalidate_all_caches(struct intel_engine *ring)
+intel_ring_invalidate_all_caches(struct intel_engine *ring,
+				 struct i915_hw_context *ctx)
 {
 	uint32_t flush_domains;
 	int ret;
@@ -2323,7 +2347,7 @@  intel_ring_invalidate_all_caches(struct intel_engine *ring)
 	if (ring->gpu_caches_dirty)
 		flush_domains = I915_GEM_GPU_DOMAINS;
 
-	ret = ring->flush(ring, I915_GEM_GPU_DOMAINS, flush_domains);
+	ret = ring->flush(ring, ctx, I915_GEM_GPU_DOMAINS, flush_domains);
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index c9328fd..4ed68b4 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -75,6 +75,8 @@  struct intel_ringbuffer {
 	u32 last_retired_head;
 };
 
+struct i915_hw_context;
+
 struct intel_engine {
 	const char	*name;
 	enum intel_ring_id {
@@ -101,11 +103,13 @@  struct intel_engine {
 	int		(*init)(struct intel_engine *ring);
 
 	void		(*write_tail)(struct intel_engine *ring,
-				      u32 value);
+				      struct i915_hw_context *ctx, u32 value);
 	int __must_check (*flush)(struct intel_engine *ring,
+				  struct i915_hw_context *ctx,
 				  u32	invalidate_domains,
 				  u32	flush_domains);
-	int		(*add_request)(struct intel_engine *ring);
+	int		(*add_request)(struct intel_engine *ring,
+				       struct i915_hw_context *ctx);
 	/* Some chipsets are not quite as coherent as advertised and need
 	 * an expensive kick to force a true read of the up-to-date seqno.
 	 * However, the up-to-date seqno is not always required and the last
@@ -117,6 +121,7 @@  struct intel_engine {
 	void		(*set_seqno)(struct intel_engine *ring,
 				     u32 seqno);
 	int		(*dispatch_execbuffer)(struct intel_engine *ring,
+					       struct i915_hw_context *ctx,
 					       u64 offset, u32 length,
 					       unsigned flags);
 #define I915_DISPATCH_SECURE 0x1
@@ -289,7 +294,10 @@  intel_write_status_page(struct intel_engine *ring,
 void intel_stop_ring(struct intel_engine *ring);
 void intel_cleanup_ring(struct intel_engine *ring);
 
-int __must_check intel_ring_begin(struct intel_engine *ring, int n);
+int __must_check intel_ringbuffer_begin(struct intel_engine *ring,
+					   struct i915_hw_context *ctx, int n);
+#define intel_ring_begin(ring, n) \
+		intel_ringbuffer_begin(ring, ring->default_context, n)
 int __must_check intel_ring_cacheline_align(struct intel_engine *ring);
 static inline void intel_ring_emit(struct intel_engine *ring,
 				   u32 data)
@@ -305,12 +313,15 @@  static inline void intel_ring_advance(struct intel_engine *ring)
 
 	ringbuf->tail &= ringbuf->size - 1;
 }
-void __intel_ring_advance(struct intel_engine *ring);
+void __intel_ring_advance(struct intel_engine *ring,
+			  struct i915_hw_context *ctx);
 
 int __must_check intel_ring_idle(struct intel_engine *ring);
 void intel_ring_init_seqno(struct intel_engine *ring, u32 seqno);
-int intel_ring_flush_all_caches(struct intel_engine *ring);
-int intel_ring_invalidate_all_caches(struct intel_engine *ring);
+int intel_ring_flush_all_caches(struct intel_engine *ring,
+				struct i915_hw_context *ctx);
+int intel_ring_invalidate_all_caches(struct intel_engine *ring,
+				     struct i915_hw_context *ctx);
 
 void intel_init_rings_early(struct drm_device *dev);
 int intel_init_render_ring(struct drm_device *dev);