diff mbox

[RFC,4/4] drm: i915: Atomic pageflip WIP

Message ID 1347464827-14187-5-git-send-email-ville.syrjala@linux.intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Ville Syrjälä Sept. 12, 2012, 3:47 p.m. UTC
From: Ville Syrjälä <ville.syrjala@linux.intel.com>

Use the drm_flip helper to implement atomic page flipping.

Work in progress. Ignore the huge mess in intel_sprite.c for now.
---
 drivers/gpu/drm/i915/i915_drv.h      |    4 +
 drivers/gpu/drm/i915/i915_irq.c      |   10 +-
 drivers/gpu/drm/i915/intel_atomic.c  |  449 +++++++++++++++++++++++++++++++++-
 drivers/gpu/drm/i915/intel_display.c |  171 +++++++++----
 drivers/gpu/drm/i915/intel_drv.h     |   29 +++
 drivers/gpu/drm/i915/intel_sprite.c  |  386 ++++++++++++++++++-----------
 6 files changed, 841 insertions(+), 208 deletions(-)

Comments

Chris Wilson Sept. 14, 2012, 1:57 p.m. UTC | #1
On Wed, 12 Sep 2012 18:47:07 +0300, ville.syrjala@linux.intel.com wrote:
> +static void intel_flip_finish(struct drm_flip *flip)
> +{
> +	struct intel_flip *intel_flip =
> +		container_of(flip, struct intel_flip, base);
> +	struct drm_device *dev = intel_flip->crtc->dev;
> +
> +	if (intel_flip->old_bo) {
> +		mutex_lock(&dev->struct_mutex);
> +
> +		intel_finish_fb(intel_flip->old_bo);

So if I understand correctly, this code is called after the flip is
already complete?

The intel_finish_fb() exists to flush pending batches and flips on the
current fb, prior to changing the scanout registers. (There is a
hardware dependency such that the GPU may be executing a command that
required the current modesetting.) In the case of flip completion, all
of those dependencies have already been retired and so the finish should
be a no-op. And so it should no be required, nor the changes to
intel_finish_fb (which should have included a change in the name to
indicate that is now taking the fb_obj).

> +		intel_unpin_fb_obj(intel_flip->old_bo);
> +
> +		mutex_unlock(&dev->struct_mutex);
> +	}
> +
> +}
-Chris
Ville Syrjälä Sept. 14, 2012, 2:21 p.m. UTC | #2
On Fri, Sep 14, 2012 at 02:57:26PM +0100, Chris Wilson wrote:
> On Wed, 12 Sep 2012 18:47:07 +0300, ville.syrjala@linux.intel.com wrote:
> > +static void intel_flip_finish(struct drm_flip *flip)
> > +{
> > +	struct intel_flip *intel_flip =
> > +		container_of(flip, struct intel_flip, base);
> > +	struct drm_device *dev = intel_flip->crtc->dev;
> > +
> > +	if (intel_flip->old_bo) {
> > +		mutex_lock(&dev->struct_mutex);
> > +
> > +		intel_finish_fb(intel_flip->old_bo);
> 
> So if I understand correctly, this code is called after the flip is
> already complete?

Yes.

> The intel_finish_fb() exists to flush pending batches and flips on the
> current fb, prior to changing the scanout registers. (There is a
> hardware dependency such that the GPU may be executing a command that
> required the current modesetting.) In the case of flip completion, all
> of those dependencies have already been retired and so the finish should
> be a no-op. And so it should no be required, nor the changes to
> intel_finish_fb (which should have included a change in the name to
> indicate that is now taking the fb_obj).

Actually I'm not quite sure where this intel_finish_fb() call originated.
Based on the name it didn't make sense to me, but I left it there for
now. Hmm. OK it came from one patch from Imre while I was on vacation.
I suppose he got it from intel_pipe_set_base() which does call
intel_finish_fb() on the old fb. Why does it do that?

I've not really dug into the GPU synchronization side and pin/fence stuff,
so it's no surprise my code is a bit fscked up in those areas.
Ville Syrjälä Sept. 14, 2012, 3:30 p.m. UTC | #3
On Fri, Sep 14, 2012 at 03:27:05PM +0100, Chris Wilson wrote:
> On Fri, 14 Sep 2012 17:21:30 +0300, Ville Syrjälä <ville.syrjala@linux.intel.com> wrote:
> > On Fri, Sep 14, 2012 at 02:57:26PM +0100, Chris Wilson wrote:
> > > On Wed, 12 Sep 2012 18:47:07 +0300, ville.syrjala@linux.intel.com wrote:
> > > > +static void intel_flip_finish(struct drm_flip *flip)
> > > > +{
> > > > +	struct intel_flip *intel_flip =
> > > > +		container_of(flip, struct intel_flip, base);
> > > > +	struct drm_device *dev = intel_flip->crtc->dev;
> > > > +
> > > > +	if (intel_flip->old_bo) {
> > > > +		mutex_lock(&dev->struct_mutex);
> > > > +
> > > > +		intel_finish_fb(intel_flip->old_bo);
> > > 
> > > So if I understand correctly, this code is called after the flip is
> > > already complete?
> > 
> > Yes.
> > 
> > > The intel_finish_fb() exists to flush pending batches and flips on the
> > > current fb, prior to changing the scanout registers. (There is a
> > > hardware dependency such that the GPU may be executing a command that
> > > required the current modesetting.) In the case of flip completion, all
> > > of those dependencies have already been retired and so the finish should
> > > be a no-op. And so it should no be required, nor the changes to
> > > intel_finish_fb (which should have included a change in the name to
> > > indicate that is now taking the fb_obj).
> > 
> > Actually I'm not quite sure where this intel_finish_fb() call originated.
> > Based on the name it didn't make sense to me, but I left it there for
> > now. Hmm. OK it came from one patch from Imre while I was on vacation.
> > I suppose he got it from intel_pipe_set_base() which does call
> > intel_finish_fb() on the old fb. Why does it do that?
> 
> It all boils down to the modeset being asynchronous to the GPU
> processing the command stream. So we may be currently processing a batch
> that is waiting on the pipe to go past a particular scanline, and if the
> modesetting were to disable that pipe, or to change its size, then we
> risk the WAIT_FOR_EVENT never completing - leading to hangcheck
> detecting the frozen display and an angry user.

intel_pipe_set_base() won't disable the pipe or change the size,
it'll just flip the primary plane. So that doesn't quite explain
why the call is there, as opposed to being called just from the
full modeset path.

Also wouldn't any batch buffer with WAIT_FOR_EVENT be in risk of
stalling, not just ones related to the old fb?

> The other aspect is to synchronize the modeset with any outstanding
> pageflips.

Right, that does make sense. But doing it from a function called
intel_finish_fb() is a bit confusing, as the condition really
shouldn't depend on any specific fb object. But I suppose this is
just a result of the "only one outstanding flip" policy.


BTW regarding this WAIT_FOR_EVENT thing. I got the impression that
the scanline window wait doesn't work on recent hardware generations
any more. Is that true? I was thinking that perhaps I could use it
along with the load register command to perform the flips through
the command queue.
Daniel Vetter Sept. 14, 2012, 3:39 p.m. UTC | #4
On Fri, Sep 14, 2012 at 5:30 PM, Ville Syrjälä
<ville.syrjala@linux.intel.com> wrote:
> intel_pipe_set_base() won't disable the pipe or change the size,
> it'll just flip the primary plane. So that doesn't quite explain
> why the call is there, as opposed to being called just from the
> full modeset path.

intel_pipe_set_base is also called in the modeset case, i.e. when we
could potentially change the height of the mode. And if we wait on a
large enough scanline which doesn't exist in the new mode this would
hang.

The other callsite of finish_fb is from intel_crtc_disable.

btw, some slight digging around with git blame gives you in both cases
the commits and comments that explain this all ;-)

Cheers, Daniel
Ville Syrjälä Sept. 14, 2012, 3:52 p.m. UTC | #5
On Fri, Sep 14, 2012 at 05:39:30PM +0200, Daniel Vetter wrote:
> On Fri, Sep 14, 2012 at 5:30 PM, Ville Syrjälä
> <ville.syrjala@linux.intel.com> wrote:
> > intel_pipe_set_base() won't disable the pipe or change the size,
> > it'll just flip the primary plane. So that doesn't quite explain
> > why the call is there, as opposed to being called just from the
> > full modeset path.
> 
> intel_pipe_set_base is also called in the modeset case, i.e. when we
> could potentially change the height of the mode. And if we wait on a
> large enough scanline which doesn't exist in the new mode this would
> hang.

Yes, I know it's called in both cases. But my point is that there
doesn't seem to be any reason to call it in the pure set_base case.

> The other callsite of finish_fb is from intel_crtc_disable.

Yep. There it does make sense to me.
Ville Syrjälä Sept. 14, 2012, 4:07 p.m. UTC | #6
On Fri, Sep 14, 2012 at 04:56:00PM +0100, Chris Wilson wrote:
> On Fri, 14 Sep 2012 18:30:21 +0300, Ville Syrjälä <ville.syrjala@linux.intel.com> wrote:
> > On Fri, Sep 14, 2012 at 03:27:05PM +0100, Chris Wilson wrote:
> > > On Fri, 14 Sep 2012 17:21:30 +0300, Ville Syrjälä <ville.syrjala@linux.intel.com> wrote:
> > > > On Fri, Sep 14, 2012 at 02:57:26PM +0100, Chris Wilson wrote:
> > > > > On Wed, 12 Sep 2012 18:47:07 +0300, ville.syrjala@linux.intel.com wrote:
> > > > > > +static void intel_flip_finish(struct drm_flip *flip)
> > > > > > +{
> > > > > > +	struct intel_flip *intel_flip =
> > > > > > +		container_of(flip, struct intel_flip, base);
> > > > > > +	struct drm_device *dev = intel_flip->crtc->dev;
> > > > > > +
> > > > > > +	if (intel_flip->old_bo) {
> > > > > > +		mutex_lock(&dev->struct_mutex);
> > > > > > +
> > > > > > +		intel_finish_fb(intel_flip->old_bo);
> > > > > 
> > > > > So if I understand correctly, this code is called after the flip is
> > > > > already complete?
> > > > 
> > > > Yes.
> > > > 
> > > > > The intel_finish_fb() exists to flush pending batches and flips on the
> > > > > current fb, prior to changing the scanout registers. (There is a
> > > > > hardware dependency such that the GPU may be executing a command that
> > > > > required the current modesetting.) In the case of flip completion, all
> > > > > of those dependencies have already been retired and so the finish should
> > > > > be a no-op. And so it should no be required, nor the changes to
> > > > > intel_finish_fb (which should have included a change in the name to
> > > > > indicate that is now taking the fb_obj).
> > > > 
> > > > Actually I'm not quite sure where this intel_finish_fb() call originated.
> > > > Based on the name it didn't make sense to me, but I left it there for
> > > > now. Hmm. OK it came from one patch from Imre while I was on vacation.
> > > > I suppose he got it from intel_pipe_set_base() which does call
> > > > intel_finish_fb() on the old fb. Why does it do that?
> > > 
> > > It all boils down to the modeset being asynchronous to the GPU
> > > processing the command stream. So we may be currently processing a batch
> > > that is waiting on the pipe to go past a particular scanline, and if the
> > > modesetting were to disable that pipe, or to change its size, then we
> > > risk the WAIT_FOR_EVENT never completing - leading to hangcheck
> > > detecting the frozen display and an angry user.
> > 
> > intel_pipe_set_base() won't disable the pipe or change the size,
> > it'll just flip the primary plane. So that doesn't quite explain
> > why the call is there, as opposed to being called just from the
> > full modeset path.
> 
> Hmm, at the time it was a convenient point. Now, it is clearly called too
> late in the modeset sequence. Daniel, fix please. :)
> 
> > Also wouldn't any batch buffer with WAIT_FOR_EVENT be in risk of
> > stalling, not just ones related to the old fb?
> > 
> > > The other aspect is to synchronize the modeset with any outstanding
> > > pageflips.
> > 
> > Right, that does make sense. But doing it from a function called
> > intel_finish_fb() is a bit confusing, as the condition really
> > shouldn't depend on any specific fb object. But I suppose this is
> > just a result of the "only one outstanding flip" policy.
> 
> Again, a nice convenient point, calling it an intel_crtc_wait_*() would
> probably help (after fixing the ordering).
> 
> > BTW regarding this WAIT_FOR_EVENT thing. I got the impression that
> > the scanline window wait doesn't work on recent hardware generations
> > any more. Is that true? I was thinking that perhaps I could use it
> > along with the load register command to perform the flips through
> > the command queue.
> 
> That impression is pretty accurate. There is a suggestion that some form
> of scanline wait was restored for IVB, but driving it seems pretty hit
> and miss. Atomic flipping should all be possible with MI_DISPLAY_FLIP,
> so presumably you are mostly thinking about atomic modeset? Is the
> presumption that it will be an infrequent request and so better to keep
> as simple as possible?

MI_DISPLAY_FLIP is _very_ limited. You can't really do anything
interesting with. Sure it's enough to flip the buffers of some game
or whatever, but it's practically useless for hardware composition
type stuff.
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 57e4894..e29994f 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -41,6 +41,8 @@ 
 #include <linux/intel-iommu.h>
 #include <linux/kref.h>
 
+#include "drm_flip.h"
+
 /* General customization:
  */
 
@@ -845,6 +847,8 @@  typedef struct drm_i915_private {
 	struct work_struct parity_error_work;
 	bool hw_contexts_disabled;
 	uint32_t hw_context_size;
+
+	struct drm_flip_driver flip_driver;
 } drm_i915_private_t;
 
 /* Iterate over initialised rings */
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 23f2ea0..78ff3e0 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -37,6 +37,8 @@ 
 #include "i915_trace.h"
 #include "intel_drv.h"
 
+void intel_handle_vblank(struct drm_device *dev, int pipe);
+
 /* For display hotplug interrupt */
 static void
 ironlake_enable_display_irq(drm_i915_private_t *dev_priv, u32 mask)
@@ -548,7 +550,7 @@  static irqreturn_t valleyview_irq_handler(DRM_IRQ_ARGS)
 
 		for_each_pipe(pipe) {
 			if (pipe_stats[pipe] & PIPE_VBLANK_INTERRUPT_STATUS)
-				drm_handle_vblank(dev, pipe);
+				intel_handle_vblank(dev, pipe);
 
 			if (pipe_stats[pipe] & PLANE_FLIPDONE_INT_STATUS_VLV) {
 				intel_prepare_page_flip(dev, pipe);
@@ -686,7 +688,7 @@  static irqreturn_t ivybridge_irq_handler(DRM_IRQ_ARGS)
 				intel_finish_page_flip_plane(dev, i);
 			}
 			if (de_iir & (DE_PIPEA_VBLANK_IVB << (5 * i)))
-				drm_handle_vblank(dev, i);
+				intel_handle_vblank(dev, i);
 		}
 
 		/* check event from PCH */
@@ -779,10 +781,10 @@  static irqreturn_t ironlake_irq_handler(DRM_IRQ_ARGS)
 	}
 
 	if (de_iir & DE_PIPEA_VBLANK)
-		drm_handle_vblank(dev, 0);
+		intel_handle_vblank(dev, 0);
 
 	if (de_iir & DE_PIPEB_VBLANK)
-		drm_handle_vblank(dev, 1);
+		intel_handle_vblank(dev, 1);
 
 	/* check event from PCH */
 	if (de_iir & DE_PCH_EVENT) {
diff --git a/drivers/gpu/drm/i915/intel_atomic.c b/drivers/gpu/drm/i915/intel_atomic.c
index 0a96d15..6d8d7d3 100644
--- a/drivers/gpu/drm/i915/intel_atomic.c
+++ b/drivers/gpu/drm/i915/intel_atomic.c
@@ -3,6 +3,7 @@ 
 
 #include <drm/drmP.h>
 #include <drm/drm_crtc.h>
+#include <drm/drm_flip.h>
 
 #include "intel_drv.h"
 
@@ -1309,6 +1310,46 @@  static void update_props(const struct drm_device *dev,
 	}
 }
 
+static int atomic_pipe_commit(struct drm_device *dev,
+			      struct intel_atomic_state *state,
+			      int pipe);
+
+static int try_atomic_commit(struct drm_device *dev, struct intel_atomic_state *s)
+{
+	int ret;
+	int i;
+	u32 pipes = 0;
+
+	for (i = 0; i < dev->mode_config.num_crtc; i++) {
+		struct intel_crtc_state *st = &s->crtc[i];
+
+		if (st->mode_dirty)
+			return -EAGAIN;
+
+		if (st->fb_dirty)
+			pipes |= 1 << to_intel_crtc(st->crtc)->pipe;
+	}
+
+	for (i = 0; i < dev->mode_config.num_plane; i++) {
+		struct intel_plane_state *st = &s->plane[i];
+
+		if (st->dirty)
+			pipes |= 1 << to_intel_plane(st->plane)->pipe;
+	}
+
+	if (hweight32(pipes) != 1)
+		return -EAGAIN;
+
+	ret = atomic_pipe_commit(dev, s, ffs(pipes) - 1);
+	if (ret)
+		return ret;
+
+	/* don't restore the old state in end() */
+	s->dirty = false;
+
+	return 0;
+}
+
 static int intel_atomic_commit(struct drm_device *dev, void *state)
 {
 	struct intel_atomic_state *s = state;
@@ -1325,17 +1366,23 @@  static int intel_atomic_commit(struct drm_device *dev, void *state)
 	if (ret)
 		return ret;
 
-	ret = apply_config(dev, s);
+	/* try to apply asynchronously and atomically */
+	ret = try_atomic_commit(dev, s);
+
+	/* fall back to sync/non-atomic */
 	if (ret) {
-		unpin_fbs(dev, s);
-		unpin_cursors(dev, s);
-		s->restore_hw = true;
-		return ret;
-	}
+		ret = apply_config(dev, s);
+		if (ret) {
+			unpin_fbs(dev, s);
+			unpin_cursors(dev, s);
+			s->restore_hw = true;
+			return ret;
+		}
 
-	unpin_old_fbs(dev, s);
+		unpin_old_fbs(dev, s);
 
-	unpin_old_cursors(dev, s);
+		unpin_old_cursors(dev, s);
+	}
 
 	update_plane_obj(dev, s);
 
@@ -1390,6 +1437,9 @@  static struct {
 	{ &prop_cursor_y, "CURSOR_Y", INT_MIN, INT_MAX },
 };
 
+static void intel_flip_init(struct drm_device *dev);
+static void intel_flip_fini(struct drm_device *dev);
+
 int intel_atomic_init(struct drm_device *dev)
 {
 	struct drm_crtc *crtc;
@@ -1452,6 +1502,8 @@  int intel_atomic_init(struct drm_device *dev)
 
 	unpin_workqueue = create_workqueue("i915_fb_unpin");
 
+	intel_flip_init(dev);
+
 	return 0;
 
  out:
@@ -1465,6 +1517,8 @@  void intel_atomic_fini(struct drm_device *dev)
 {
 	struct drm_crtc *crtc;
 
+	intel_flip_fini(dev);
+
 	destroy_workqueue(unpin_workqueue);
 
 	list_for_each_entry(crtc, &dev->mode_config.crtc_list, head)
@@ -1485,3 +1539,382 @@  void intel_atomic_fini(struct drm_device *dev)
 	drm_property_destroy(dev, prop_src_y);
 	drm_property_destroy(dev, prop_src_x);
 }
+
+void intel_plane_calc(struct drm_crtc *crtc, struct drm_framebuffer *fb, int x, int y);
+void intel_plane_prepare(struct drm_crtc *crtc);
+void intel_plane_commit(struct drm_crtc *crtc);
+void intel_sprite_calc(struct drm_plane *plane, struct drm_framebuffer *fb, const struct intel_plane_coords *coords);
+void intel_sprite_prepare(struct drm_plane *plane);
+void intel_sprite_commit(struct drm_plane *plane);
+
+enum {
+	/* somwehat arbitrary value */
+	INTEL_VBL_CNT_TIMEOUT = 5,
+};
+
+struct intel_flip
+{
+	struct drm_flip base;
+	u32 vbl_count;
+	bool vblank_ref;
+	struct drm_crtc *crtc;
+	struct drm_plane *plane;
+	struct drm_i915_gem_object *old_bo;
+};
+
+static void intel_flip_complete(struct drm_flip *flip)
+{
+	struct intel_flip *intel_flip =
+		container_of(flip, struct intel_flip, base);
+	struct drm_device *dev = intel_flip->crtc->dev;
+	int pipe = to_intel_crtc(intel_flip->crtc)->pipe;
+
+	// send flip event
+
+	if (intel_flip->vblank_ref)
+		drm_vblank_put(dev, pipe);
+
+	// allow GPU to render to old_obj
+}
+
+static void intel_flip_finish(struct drm_flip *flip)
+{
+	struct intel_flip *intel_flip =
+		container_of(flip, struct intel_flip, base);
+	struct drm_device *dev = intel_flip->crtc->dev;
+
+	if (intel_flip->old_bo) {
+		mutex_lock(&dev->struct_mutex);
+
+		intel_finish_fb(intel_flip->old_bo);
+		intel_unpin_fb_obj(intel_flip->old_bo);
+
+		mutex_unlock(&dev->struct_mutex);
+	}
+
+}
+
+static void intel_flip_cleanup(struct drm_flip *flip)
+{
+	struct intel_flip *intel_flip =
+		container_of(flip, struct intel_flip, base);
+
+	kfree(intel_flip);
+}
+
+static void intel_flip_driver_flush(struct drm_flip_driver *driver)
+{
+	struct drm_i915_private *dev_priv =
+		container_of(driver, struct drm_i915_private, flip_driver);
+
+	/* Flush posted writes */
+	I915_READ(PIPESTAT(PIPE_A));
+}
+
+static u32 get_vbl_count(struct drm_crtc *crtc)
+{
+	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
+	struct drm_i915_private *dev_priv = crtc->dev->dev_private;
+	int pipe = intel_crtc->pipe;
+	u32 high, low1, low2, dsl;
+	unsigned int timeout = 0;
+
+	/*
+	 * FIXME check where the frame counter increments, and if
+	 * it happens in the middle of some line, take appropriate
+	 * measures to get a sensible reading.
+	 */
+
+	/* All reads must be satisfied during the same frame */
+	do {
+		low1 = I915_READ(PIPEFRAMEPIXEL(pipe)) >> PIPE_FRAME_LOW_SHIFT;
+		high = I915_READ(PIPEFRAME(pipe)) << 8;
+		dsl = I915_READ(PIPEDSL(pipe));
+		low2 = I915_READ(PIPEFRAMEPIXEL(pipe)) >> PIPE_FRAME_LOW_SHIFT;
+	} while (low1 != low2 && timeout++ < INTEL_VBL_CNT_TIMEOUT);
+
+	if (timeout >= INTEL_VBL_CNT_TIMEOUT)
+		dev_warn(crtc->dev->dev,
+			 "Timed out while determining VBL count for pipe %d\n",
+			 intel_crtc->pipe);
+
+	return ((high | low2) +
+		((dsl >= crtc->hwmode.crtc_vdisplay) &&
+		 (dsl < crtc->hwmode.crtc_vtotal - 1))) & 0xffffff;
+}
+
+static unsigned int usecs_to_scanlines(struct drm_crtc *crtc,
+				       unsigned int usecs)
+{
+	/* paranoia */
+	if (!crtc->hwmode.crtc_htotal)
+		return 1;
+
+	return DIV_ROUND_UP(usecs * crtc->hwmode.clock,
+			    1000 * crtc->hwmode.crtc_htotal);
+}
+
+static void intel_pipe_vblank_evade(struct drm_crtc *crtc)
+{
+	struct drm_device *dev = crtc->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
+	int pipe = intel_crtc->pipe;
+	/* FIXME needs to be calibrated sensibly */
+	u32 min = crtc->hwmode.crtc_vdisplay - usecs_to_scanlines(crtc, 60);
+	u32 max = crtc->hwmode.crtc_vdisplay - 1;
+	long timeout = msecs_to_jiffies(1);
+	u32 val;
+
+	bool vblank_ref = drm_vblank_get(dev, pipe) == 0;
+
+	intel_crtc->vbl_received = false;
+
+	val = I915_READ(PIPEDSL(pipe));
+
+	while (val >= min && val <= max && timeout > 0) {
+		local_irq_enable();
+
+		timeout = wait_event_timeout(intel_crtc->vbl_wait,
+					     intel_crtc->vbl_received,
+					     timeout);
+
+		local_irq_disable();
+
+		intel_crtc->vbl_received = false;
+
+		val = I915_READ(PIPEDSL(pipe));
+	}
+
+	if (vblank_ref)
+		drm_vblank_put(dev, pipe);
+
+	if (val >= min && val <= max)
+		dev_warn(dev->dev,
+			 "Page flipping close to vblank start (DSL=%u, VBL=%u)\n",
+			 val, crtc->hwmode.crtc_vdisplay);
+}
+
+static bool vbl_count_after_eq(u32 a, u32 b)
+{
+	return !((a - b) & 0x800000);
+}
+
+static bool intel_vbl_check(struct drm_flip *pending_flip, u32 vbl_count)
+{
+	struct intel_flip *old_intel_flip =
+		container_of(pending_flip, struct intel_flip, base);
+
+	return vbl_count_after_eq(vbl_count, old_intel_flip->vbl_count);
+}
+
+static void intel_flip_prepare(struct drm_flip *flip)
+{
+	struct intel_flip *intel_flip =
+		container_of(flip, struct intel_flip, base);
+
+	/* FIXME some other pipe/pf stuff could be performed here as well. */
+
+	/* stage double buffer updates which need arming by something else */
+	if (intel_flip->plane)
+		intel_sprite_prepare(intel_flip->plane);
+	else
+		intel_plane_prepare(intel_flip->crtc);
+}
+
+static bool intel_flip_flip(struct drm_flip *flip,
+			    struct drm_flip *pending_flip)
+{
+	struct intel_flip *intel_flip = container_of(flip, struct intel_flip, base);
+	struct drm_crtc *crtc = intel_flip->crtc;
+	struct drm_device *dev = crtc->dev;
+	int pipe = to_intel_crtc(crtc)->pipe;
+	u32 vbl_count;
+
+	struct intel_flip *old_intel_flip = NULL;
+	if (pending_flip)
+		old_intel_flip = container_of(pending_flip, struct intel_flip, base);
+
+	intel_flip->vblank_ref = drm_vblank_get(dev, pipe) == 0;
+
+	vbl_count = get_vbl_count(crtc);
+
+	/* arm all the double buffer registers */
+	if (intel_flip->plane)
+		intel_sprite_commit(intel_flip->plane);
+	else
+		intel_plane_commit(crtc);
+
+	/* This flip will happen on the next vblank */
+	intel_flip->vbl_count = (vbl_count + 1) & 0xffffff;
+
+	/* FIXME oops in unpin when swapping old bos */
+#if 0
+	if (pending_flip) {
+		struct intel_flip *old_intel_flip =
+			container_of(pending_flip, struct intel_flip, base);
+		bool flipped = intel_vbl_check(pending_flip, vbl_count);
+
+		if (!flipped)
+			swap(intel_flip->old_bo, old_intel_flip->old_bo);
+
+		return flipped;
+	}
+#endif
+
+	return false;
+}
+
+static bool intel_flip_vblank(struct drm_flip *pending_flip)
+{
+	struct intel_flip *old_intel_flip =
+		container_of(pending_flip, struct intel_flip, base);
+
+	u32 vbl_count = get_vbl_count(old_intel_flip->crtc);
+
+	return intel_vbl_check(pending_flip, vbl_count);
+}
+
+static const struct drm_flip_helper_funcs intel_flip_funcs = {
+	.prepare = intel_flip_prepare,
+	.flip = intel_flip_flip,
+	.vblank = intel_flip_vblank,
+	.complete = intel_flip_complete,
+	.finish = intel_flip_finish,
+	.cleanup = intel_flip_cleanup,
+};
+
+static const struct drm_flip_driver_funcs intel_flip_driver_funcs = {
+	.flush = intel_flip_driver_flush,
+};
+
+static void intel_flip_init(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_crtc *intel_crtc;
+	struct intel_plane *intel_plane;
+
+	drm_flip_driver_init(&dev_priv->flip_driver, &intel_flip_driver_funcs);
+
+	list_for_each_entry(intel_crtc, &dev->mode_config.crtc_list, base.head) {
+		init_waitqueue_head(&intel_crtc->vbl_wait);
+
+		drm_flip_helper_init(&intel_crtc->flip_helper,
+				     &dev_priv->flip_driver, &intel_flip_funcs);
+	}
+
+	list_for_each_entry(intel_plane, &dev->mode_config.plane_list, base.head)
+		drm_flip_helper_init(&intel_plane->flip_helper,
+				     &dev_priv->flip_driver, &intel_flip_funcs);
+}
+
+static void intel_flip_fini(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct drm_crtc *crtc;
+	struct drm_plane *plane;
+
+	list_for_each_entry(plane, &dev->mode_config.plane_list, head)
+		drm_flip_helper_fini(&to_intel_plane(plane)->flip_helper);
+
+	list_for_each_entry(crtc, &dev->mode_config.crtc_list, head)
+		drm_flip_helper_fini(&to_intel_crtc(crtc)->flip_helper);
+
+	drm_flip_driver_fini(&dev_priv->flip_driver);
+}
+
+static int atomic_pipe_commit(struct drm_device *dev,
+			      struct intel_atomic_state *state,
+			      int pipe)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_flip *intel_flip, *next;
+	LIST_HEAD(flips);
+	int i;
+
+	for (i = 0; i < dev->mode_config.num_crtc; i++) {
+		struct intel_crtc_state *st = &state->crtc[i];
+		struct drm_crtc *crtc = st->crtc;
+		struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
+
+		if (!st->fb_dirty)
+			continue;
+
+		if (intel_crtc->pipe != pipe)
+			continue;
+
+		intel_flip = kzalloc(sizeof *intel_flip, GFP_KERNEL);
+		if (!intel_flip)
+			goto out;
+
+		intel_plane_calc(crtc, crtc->fb, crtc->x, crtc->y);
+
+		drm_flip_init(&intel_flip->base, &intel_crtc->flip_helper);
+		intel_flip->crtc = crtc;
+		if (st->old_fb)
+			intel_flip->old_bo = to_intel_framebuffer(st->old_fb)->obj;
+
+		list_add_tail(&intel_flip->base.list, &flips);
+	}
+
+	for (i = 0; i < dev->mode_config.num_plane; i++) {
+		struct intel_plane_state *st = &state->plane[i];
+		struct drm_plane *plane = st->plane;
+		struct intel_plane *intel_plane = to_intel_plane(plane);
+
+		if (!st->dirty)
+			continue;
+
+		if (intel_plane->pipe != pipe)
+			continue;
+
+		intel_flip = kzalloc(sizeof *intel_flip, GFP_KERNEL);
+		if (!intel_flip)
+			goto out;
+
+		intel_sprite_calc(plane, plane->fb, &st->coords);
+
+		drm_flip_init(&intel_flip->base, &intel_plane->flip_helper);
+		intel_flip->crtc = intel_get_crtc_for_pipe(dev, pipe);
+		intel_flip->plane = plane;
+		if (st->old_fb)
+			intel_flip->old_bo = to_intel_framebuffer(st->old_fb)->obj;
+
+		list_add_tail(&intel_flip->base.list, &flips);
+	}
+
+	/* FIXME if pipe is not enabled complete flip immediately */
+
+	drm_flip_driver_prepare_flips(&dev_priv->flip_driver, &flips);
+
+	local_irq_disable();
+
+	intel_pipe_vblank_evade(intel_get_crtc_for_pipe(dev, pipe));
+
+	drm_flip_driver_schedule_flips(&dev_priv->flip_driver, &flips);
+
+	local_irq_enable();
+
+	return 0;
+
+ out:
+	list_for_each_entry_safe(intel_flip, next, &flips, base.list)
+		kfree(intel_flip);
+
+	return -ENOMEM;
+}
+
+void intel_handle_vblank(struct drm_device *dev, int pipe)
+{
+	struct intel_crtc *intel_crtc = to_intel_crtc(intel_get_crtc_for_pipe(dev, pipe));
+	struct intel_plane *intel_plane;
+
+	drm_handle_vblank(dev, pipe);
+
+	drm_flip_helper_vblank(&intel_crtc->flip_helper);
+
+	list_for_each_entry(intel_plane, &dev->mode_config.plane_list, base.head) {
+		if (intel_plane->pipe == pipe)
+			drm_flip_helper_vblank(&intel_plane->flip_helper);
+	}
+}
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 01c1a19..23fbc0b 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -1988,8 +1988,36 @@  static unsigned long gen4_compute_dspaddr_offset_xtiled(int *x, int *y,
 	return tile_rows * pitch * 8 + tiles * 4096;
 }
 
-static int i9xx_update_plane(struct drm_crtc *crtc, struct drm_framebuffer *fb,
-			     int x, int y)
+void intel_plane_prepare(struct drm_crtc *crtc)
+{
+	struct drm_device *dev = crtc->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
+	int plane = intel_crtc->plane;
+	const struct intel_plane_regs *regs = &intel_crtc->primary_regs;
+
+	I915_WRITE(DSPCNTR(plane), regs->cntr);
+	I915_WRITE(DSPSTRIDE(plane), regs->stride);
+}
+
+void intel_plane_commit(struct drm_crtc *crtc)
+{
+	struct drm_device *dev = crtc->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
+	int plane = intel_crtc->plane;
+	const struct intel_plane_regs *regs = &intel_crtc->primary_regs;
+
+	if (INTEL_INFO(dev)->gen >= 4) {
+		I915_WRITE(DSPSURF(plane), regs->surf);
+		I915_WRITE(DSPTILEOFF(plane), regs->tileoff);
+		I915_WRITE(DSPLINOFF(plane), regs->linoff);
+	} else
+		I915_WRITE(DSPADDR(plane), regs->addr);
+}
+
+static int i9xx_calc_plane(struct drm_crtc *crtc, struct drm_framebuffer *fb,
+			   int x, int y)
 {
 	struct drm_device *dev = crtc->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -1998,9 +2026,8 @@  static int i9xx_update_plane(struct drm_crtc *crtc, struct drm_framebuffer *fb,
 	struct drm_i915_gem_object *obj;
 	int plane = intel_crtc->plane;
 	unsigned long linear_offset;
-	u32 dspcntr;
-	u32 reg;
 	unsigned int cpp = drm_format_plane_cpp(fb->pixel_format, 0);
+	struct intel_plane_regs *regs = &intel_crtc->primary_regs;
 
 	switch (plane) {
 	case 0:
@@ -2014,36 +2041,35 @@  static int i9xx_update_plane(struct drm_crtc *crtc, struct drm_framebuffer *fb,
 	intel_fb = to_intel_framebuffer(fb);
 	obj = intel_fb->obj;
 
-	reg = DSPCNTR(plane);
-	dspcntr = I915_READ(reg);
+	regs->cntr = I915_READ(DSPCNTR(plane));
 	/* Mask out pixel format bits in case we change it */
-	dspcntr &= ~DISPPLANE_PIXFORMAT_MASK;
+	regs->cntr &= ~DISPPLANE_PIXFORMAT_MASK;
 	switch (fb->pixel_format) {
 	case DRM_FORMAT_C8:
-		dspcntr |= DISPPLANE_8BPP;
+		regs->cntr |= DISPPLANE_8BPP;
 		break;
 	case DRM_FORMAT_XRGB1555:
 	case DRM_FORMAT_ARGB1555:
-		dspcntr |= DISPPLANE_BGRX555;
+		regs->cntr |= DISPPLANE_BGRX555;
 		break;
 	case DRM_FORMAT_RGB565:
-		dspcntr |= DISPPLANE_BGRX565;
+		regs->cntr |= DISPPLANE_BGRX565;
 		break;
 	case DRM_FORMAT_XRGB8888:
 	case DRM_FORMAT_ARGB8888:
-		dspcntr |= DISPPLANE_BGRX888;
+		regs->cntr |= DISPPLANE_BGRX888;
 		break;
 	case DRM_FORMAT_XBGR8888:
 	case DRM_FORMAT_ABGR8888:
-		dspcntr |= DISPPLANE_RGBX888;
+		regs->cntr |= DISPPLANE_RGBX888;
 		break;
 	case DRM_FORMAT_XRGB2101010:
 	case DRM_FORMAT_ARGB2101010:
-		dspcntr |= DISPPLANE_BGRX101010;
+		regs->cntr |= DISPPLANE_BGRX101010;
 		break;
 	case DRM_FORMAT_XBGR2101010:
 	case DRM_FORMAT_ABGR2101010:
-		dspcntr |= DISPPLANE_RGBX101010;
+		regs->cntr |= DISPPLANE_RGBX101010;
 		break;
 	default:
 		DRM_ERROR("Unknown pixel format 0x%08x\n", fb->pixel_format);
@@ -2052,13 +2078,11 @@  static int i9xx_update_plane(struct drm_crtc *crtc, struct drm_framebuffer *fb,
 
 	if (INTEL_INFO(dev)->gen >= 4) {
 		if (obj->tiling_mode != I915_TILING_NONE)
-			dspcntr |= DISPPLANE_TILED;
+			regs->cntr |= DISPPLANE_TILED;
 		else
-			dspcntr &= ~DISPPLANE_TILED;
+			regs->cntr &= ~DISPPLANE_TILED;
 	}
 
-	I915_WRITE(reg, dspcntr);
-
 	linear_offset = fb->offsets[0] + y * fb->pitches[0] + x * cpp;
 
 	if (INTEL_INFO(dev)->gen >= 4) {
@@ -2072,21 +2096,42 @@  static int i9xx_update_plane(struct drm_crtc *crtc, struct drm_framebuffer *fb,
 
 	DRM_DEBUG_KMS("Writing base %08X %08lX %d %d %d\n",
 		      obj->gtt_offset, linear_offset, x, y, fb->pitches[0]);
-	I915_WRITE(DSPSTRIDE(plane), fb->pitches[0]);
+
+	regs->stride = fb->pitches[0];
+
 	if (INTEL_INFO(dev)->gen >= 4) {
-		I915_MODIFY_DISPBASE(DSPSURF(plane),
-				     obj->gtt_offset + intel_crtc->dspaddr_offset);
-		I915_WRITE(DSPTILEOFF(plane), (y << 16) | x);
-		I915_WRITE(DSPLINOFF(plane), linear_offset);
+		regs->surf = I915_LO_DISPBASE(I915_READ(DSPSURF(plane)));
+		regs->surf |= obj->gtt_offset + intel_crtc->dspaddr_offset;
+		regs->tileoff = (y << 16) | x;
+		regs->linoff = linear_offset;
 	} else
-		I915_WRITE(DSPADDR(plane), obj->gtt_offset + linear_offset);
-	POSTING_READ(reg);
+		regs->addr = obj->gtt_offset + linear_offset;
 
 	return 0;
 }
 
-static int ironlake_update_plane(struct drm_crtc *crtc,
-				 struct drm_framebuffer *fb, int x, int y)
+static int i9xx_update_plane(struct drm_crtc *crtc, struct drm_framebuffer *fb,
+			     int x, int y)
+{
+	struct drm_i915_private *dev_priv = crtc->dev->dev_private;
+	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
+	int plane = intel_crtc->plane;
+	int ret;
+
+	ret = i9xx_calc_plane(crtc, fb, x, y);
+	if (ret)
+		return ret;
+
+	intel_plane_prepare(crtc);
+	intel_plane_commit(crtc);
+
+	POSTING_READ(DSPCNTR(plane));
+
+	return 0;
+}
+
+static int ironlake_calc_plane(struct drm_crtc *crtc,
+			       struct drm_framebuffer *fb, int x, int y)
 {
 	struct drm_device *dev = crtc->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -2095,9 +2140,8 @@  static int ironlake_update_plane(struct drm_crtc *crtc,
 	struct drm_i915_gem_object *obj;
 	int plane = intel_crtc->plane;
 	unsigned long linear_offset;
-	u32 dspcntr;
-	u32 reg;
 	unsigned int cpp = drm_format_plane_cpp(fb->pixel_format, 0);
+	struct intel_plane_regs *regs = &intel_crtc->primary_regs;
 
 	switch (plane) {
 	case 0:
@@ -2112,32 +2156,31 @@  static int ironlake_update_plane(struct drm_crtc *crtc,
 	intel_fb = to_intel_framebuffer(fb);
 	obj = intel_fb->obj;
 
-	reg = DSPCNTR(plane);
-	dspcntr = I915_READ(reg);
+	regs->cntr = I915_READ(DSPCNTR(plane));
 	/* Mask out pixel format bits in case we change it */
-	dspcntr &= ~DISPPLANE_PIXFORMAT_MASK;
+	regs->cntr &= ~DISPPLANE_PIXFORMAT_MASK;
 	switch (fb->pixel_format) {
 	case DRM_FORMAT_C8:
-		dspcntr |= DISPPLANE_8BPP;
+		regs->cntr |= DISPPLANE_8BPP;
 		break;
 	case DRM_FORMAT_RGB565:
-		dspcntr |= DISPPLANE_BGRX565;
+		regs->cntr |= DISPPLANE_BGRX565;
 		break;
 	case DRM_FORMAT_XRGB8888:
 	case DRM_FORMAT_ARGB8888:
-		dspcntr |= DISPPLANE_BGRX888;
+		regs->cntr |= DISPPLANE_BGRX888;
 		break;
 	case DRM_FORMAT_XBGR8888:
 	case DRM_FORMAT_ABGR8888:
-		dspcntr |= DISPPLANE_RGBX888;
+		regs->cntr |= DISPPLANE_RGBX888;
 		break;
 	case DRM_FORMAT_XRGB2101010:
 	case DRM_FORMAT_ARGB2101010:
-		dspcntr |= DISPPLANE_BGRX101010;
+		regs->cntr |= DISPPLANE_BGRX101010;
 		break;
 	case DRM_FORMAT_XBGR2101010:
 	case DRM_FORMAT_ABGR2101010:
-		dspcntr |= DISPPLANE_RGBX101010;
+		regs->cntr |= DISPPLANE_RGBX101010;
 		break;
 	default:
 		DRM_ERROR("Unknown pixel format 0x%08x\n", fb->pixel_format);
@@ -2145,33 +2188,57 @@  static int ironlake_update_plane(struct drm_crtc *crtc,
 	}
 
 	if (obj->tiling_mode != I915_TILING_NONE)
-		dspcntr |= DISPPLANE_TILED;
+		regs->cntr |= DISPPLANE_TILED;
 	else
-		dspcntr &= ~DISPPLANE_TILED;
+		regs->cntr &= ~DISPPLANE_TILED;
 
 	/* must disable */
-	dspcntr |= DISPPLANE_TRICKLE_FEED_DISABLE;
-
-	I915_WRITE(reg, dspcntr);
+	regs->cntr |= DISPPLANE_TRICKLE_FEED_DISABLE;
 
 	linear_offset = fb->offsets[0] + y * fb->pitches[0] + x * (fb->bits_per_pixel / 8);
-	intel_crtc->dspaddr_offset =
-		gen4_compute_dspaddr_offset_xtiled(&x, &y,
-						   cpp, fb->pitches[0]);
+	intel_crtc->dspaddr_offset = gen4_compute_dspaddr_offset_xtiled(&x, &y, cpp, fb->pitches[0]);
 	linear_offset -= intel_crtc->dspaddr_offset;
 
 	DRM_DEBUG_KMS("Writing base %08X %08lX %d %d %d\n",
 		      obj->gtt_offset, linear_offset, x, y, fb->pitches[0]);
-	I915_WRITE(DSPSTRIDE(plane), fb->pitches[0]);
-	I915_MODIFY_DISPBASE(DSPSURF(plane),
-			     obj->gtt_offset + intel_crtc->dspaddr_offset);
-	I915_WRITE(DSPTILEOFF(plane), (y << 16) | x);
-	I915_WRITE(DSPLINOFF(plane), linear_offset);
-	POSTING_READ(reg);
+	regs->stride = fb->pitches[0];
+	regs->surf = I915_LO_DISPBASE(I915_READ(DSPSURF(plane)));
+	regs->surf |= obj->gtt_offset + intel_crtc->dspaddr_offset;
+	regs->tileoff = (y << 16) | x;
+	regs->linoff = linear_offset;
 
 	return 0;
 }
 
+static int ironlake_update_plane(struct drm_crtc *crtc,
+				 struct drm_framebuffer *fb, int x, int y)
+{
+	struct drm_i915_private *dev_priv = crtc->dev->dev_private;
+	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
+	int plane = intel_crtc->plane;
+	int ret;
+
+	ret = ironlake_calc_plane(crtc, fb, x, y);
+	if (ret)
+		return ret;
+
+	intel_plane_prepare(crtc);
+	intel_plane_commit(crtc);
+
+	POSTING_READ(DSPCNTR(plane));
+
+	return 0;
+}
+
+int intel_plane_calc(struct drm_crtc *crtc,
+		     struct drm_framebuffer *fb, int x, int y)
+{
+	if (HAS_PCH_SPLIT(crtc->dev))
+		return ironlake_calc_plane(crtc, fb, x, y);
+	else
+		return i9xx_calc_plane(crtc, fb, x, y);
+}
+
 /* Assume fb object is pinned & idle & fenced and just update base pointers */
 static int
 intel_pipe_set_base_atomic(struct drm_crtc *crtc, struct drm_framebuffer *fb,
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index fa81676..52b0cc7 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -31,6 +31,7 @@ 
 #include "drm_crtc.h"
 #include "drm_crtc_helper.h"
 #include "drm_fb_helper.h"
+#include "drm_flip.h"
 
 #define _wait_for(COND, MS, W) ({ \
 	unsigned long timeout__ = jiffies + msecs_to_jiffies(MS);	\
@@ -162,6 +163,21 @@  struct intel_connector {
 	struct intel_encoder *encoder;
 };
 
+struct intel_plane_regs {
+	u32 cntr;
+	u32 addr;
+	u32 linoff;
+	u32 stride;
+	u32 pos;
+	u32 size;
+	u32 keyval;
+	u32 keymsk;
+	u32 surf;
+	u32 keymaxval;
+	u32 tileoff;
+	u32 scale;
+};
+
 struct intel_crtc {
 	struct drm_crtc base;
 	enum pipe pipe;
@@ -191,6 +207,12 @@  struct intel_crtc {
 
 	/* We can share PLLs across outputs if the timings match */
 	struct intel_pch_pll *pch_pll;
+
+	struct intel_plane_regs primary_regs;
+
+	struct drm_flip_helper flip_helper;
+	wait_queue_head_t vbl_wait;
+	bool vbl_received;
 };
 
 struct intel_plane_coords {
@@ -208,6 +230,7 @@  struct intel_plane {
 	struct drm_i915_gem_object *obj;
 	int max_downscale;
 	u32 lut_r[1024], lut_g[1024], lut_b[1024];
+	struct intel_plane_regs regs;
 	void (*update_plane)(struct drm_plane *plane,
 			     struct drm_framebuffer *fb,
 			     const struct intel_plane_coords *clip);
@@ -216,6 +239,7 @@  struct intel_plane {
 			       struct drm_intel_sprite_colorkey *key);
 	void (*get_colorkey)(struct drm_plane *plane,
 			     struct drm_intel_sprite_colorkey *key);
+	struct drm_flip_helper flip_helper;
 };
 
 struct intel_watermark_params {
@@ -525,4 +549,9 @@  extern void intel_crtc_cursor_commit(struct drm_crtc *crtc, uint32_t handle,
 				     struct drm_i915_gem_object *obj,
 				     uint32_t addr);
 
+extern int intel_calc_primary(struct drm_crtc *crtc,
+			    struct drm_framebuffer *fb, int x, int y);
+extern void intel_prepare_primary(struct drm_crtc *crtc);
+extern void intel_commit_primary(struct drm_crtc *crtc);
+
 #endif /* __INTEL_DRV_H__ */
diff --git a/drivers/gpu/drm/i915/intel_sprite.c b/drivers/gpu/drm/i915/intel_sprite.c
index d456cd2..3049a62 100644
--- a/drivers/gpu/drm/i915/intel_sprite.c
+++ b/drivers/gpu/drm/i915/intel_sprite.c
@@ -187,12 +187,10 @@  int intel_check_plane(const struct drm_plane *plane,
 }
 
 static void
-ivb_update_plane(struct drm_plane *plane,
-		 struct drm_framebuffer *fb,
-		 const struct intel_plane_coords *coords)
+ivb_calc_plane(struct drm_plane *plane,
+	       struct drm_framebuffer *fb,
+	       const struct intel_plane_coords *coords)
 {
-	struct drm_device *dev = plane->dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_plane *intel_plane = to_intel_plane(plane);
 	const struct drm_i915_gem_object *obj = to_intel_framebuffer(fb)->obj;
 	int crtc_x = coords->crtc_x;
@@ -203,47 +201,48 @@  ivb_update_plane(struct drm_plane *plane,
 	uint32_t y = coords->src_y;
 	uint32_t src_w = coords->src_w;
 	uint32_t src_h = coords->src_h;
-	int pipe = intel_plane->pipe;
-	u32 sprctl, sprscale = 0;
 	int pixel_size = drm_format_plane_cpp(fb->pixel_format, 0);
-
-	sprctl = I915_READ(SPRCTL(pipe));
+	struct intel_plane_regs *regs = &intel_plane->regs;
 
 	/* Mask out pixel format bits in case we change it */
-	sprctl &= ~SPRITE_PIXFORMAT_MASK;
-	sprctl &= ~SPRITE_RGB_ORDER_RGBX;
-	sprctl &= ~SPRITE_YUV_BYTE_ORDER_MASK;
-	sprctl &= ~SPRITE_TILED;
+	regs->cntr &= ~(SPRITE_PIXFORMAT_MASK |
+			SPRITE_RGB_ORDER_RGBX |
+			SPRITE_YUV_BYTE_ORDER_MASK |
+			SPRITE_TRICKLE_FEED_DISABLE |
+			SPRITE_TILED |
+			SPRITE_ENABLE);
 
 	switch (fb->pixel_format) {
 	case DRM_FORMAT_XBGR8888:
-		sprctl |= SPRITE_FORMAT_RGBX888;
+		regs->cntr |= SPRITE_FORMAT_RGBX888;
 		break;
 	case DRM_FORMAT_XRGB8888:
-		sprctl |= SPRITE_FORMAT_RGBX888 | SPRITE_RGB_ORDER_RGBX;
+		regs->cntr |= SPRITE_FORMAT_RGBX888 | SPRITE_RGB_ORDER_RGBX;
 		break;
 	case DRM_FORMAT_YUYV:
-		sprctl |= SPRITE_FORMAT_YUV422 | SPRITE_YUV_ORDER_YUYV;
+		regs->cntr |= SPRITE_FORMAT_YUV422 | SPRITE_YUV_ORDER_YUYV;
 		break;
 	case DRM_FORMAT_YVYU:
-		sprctl |= SPRITE_FORMAT_YUV422 | SPRITE_YUV_ORDER_YVYU;
+		regs->cntr |= SPRITE_FORMAT_YUV422 | SPRITE_YUV_ORDER_YVYU;
 		break;
 	case DRM_FORMAT_UYVY:
-		sprctl |= SPRITE_FORMAT_YUV422 | SPRITE_YUV_ORDER_UYVY;
+		regs->cntr |= SPRITE_FORMAT_YUV422 | SPRITE_YUV_ORDER_UYVY;
 		break;
 	case DRM_FORMAT_VYUY:
-		sprctl |= SPRITE_FORMAT_YUV422 | SPRITE_YUV_ORDER_VYUY;
+		regs->cntr |= SPRITE_FORMAT_YUV422 | SPRITE_YUV_ORDER_VYUY;
 		break;
 	default:
 		BUG();
 	}
 
 	if (obj->tiling_mode != I915_TILING_NONE)
-		sprctl |= SPRITE_TILED;
+		regs->cntr |= SPRITE_TILED;
 
 	/* must disable */
-	sprctl |= SPRITE_TRICKLE_FEED_DISABLE;
-	sprctl |= SPRITE_ENABLE;
+	regs->cntr |= SPRITE_TRICKLE_FEED_DISABLE;
+
+	if (coords->visible)
+		regs->cntr |= SPRITE_ENABLE;
 
 	/* Sizes are 0 based */
 	src_w--;
@@ -251,20 +250,54 @@  ivb_update_plane(struct drm_plane *plane,
 	crtc_w--;
 	crtc_h--;
 
-	intel_update_sprite_watermarks(dev, pipe, crtc_w, pixel_size);
+	/*
+	 * IVB workaround: must disable low power watermarks for at least
+	 * one frame before enabling scaling.  LP watermarks can be re-enabled
+	 * when scaling is disabled.
+	 */
+	if (coords->visible && (crtc_w != src_w || crtc_h != src_h))
+		regs->scale = SPRITE_SCALE_ENABLE | (src_w << 16) | src_h;
+	else
+		regs->scale = 0;
+
+	regs->stride = fb->pitches[0];
+	regs->pos =  (crtc_y << 16) | crtc_x;
+
+	if (obj->tiling_mode != I915_TILING_NONE) {
+		y += fb->offsets[0] / fb->pitches[0];
+		x += fb->offsets[0] % fb->pitches[0] / pixel_size;
+
+		regs->tileoff = (y << 16) | x;
+	} else
+		regs->linoff = fb->offsets[0] + y * fb->pitches[0] + x * pixel_size;
+
+	regs->size = (crtc_h << 16) | crtc_w;
+	regs->surf = I915_LO_DISPBASE(regs->surf) | obj->gtt_offset;
+}
+
+static void
+ivb_commit_plane(struct drm_plane *plane, struct drm_framebuffer *fb)
+{
+	struct drm_device *dev = plane->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_plane *intel_plane = to_intel_plane(plane);
+	int pipe = intel_plane->pipe;
+	int pixel_size = fb ? drm_format_plane_cpp(fb->pixel_format, 0) : 0;
+	const struct intel_plane_regs *regs = &intel_plane->regs;
+
+	intel_update_sprite_watermarks(dev, pipe, regs->size & 0xffff, pixel_size);
 
 	/*
 	 * IVB workaround: must disable low power watermarks for at least
 	 * one frame before enabling scaling.  LP watermarks can be re-enabled
 	 * when scaling is disabled.
 	 */
-	if (crtc_w != src_w || crtc_h != src_h) {
+	if (regs->scale & SPRITE_SCALE_ENABLE) {
 		if (!dev_priv->sprite_scaling_enabled) {
 			dev_priv->sprite_scaling_enabled = true;
 			intel_update_watermarks(dev);
 			intel_wait_for_vblank(dev, pipe);
 		}
-		sprscale = SPRITE_SCALE_ENABLE | (src_w << 16) | src_h;
 	} else {
 		if (dev_priv->sprite_scaling_enabled) {
 			dev_priv->sprite_scaling_enabled = false;
@@ -273,23 +306,31 @@  ivb_update_plane(struct drm_plane *plane,
 		}
 	}
 
-	I915_WRITE(SPRSTRIDE(pipe), fb->pitches[0]);
-	I915_WRITE(SPRPOS(pipe), (crtc_y << 16) | crtc_x);
-	if (obj->tiling_mode != I915_TILING_NONE) {
-		y += fb->offsets[0] / fb->pitches[0];
-		x += fb->offsets[0] % fb->pitches[0] / pixel_size;
+	I915_WRITE(SPRKEYVAL(pipe), regs->keyval);
+	I915_WRITE(SPRKEYMAX(pipe), regs->keymaxval);
+	I915_WRITE(SPRKEYMSK(pipe), regs->keymsk);
+	I915_WRITE(SPRSTRIDE(pipe), regs->stride);
+	I915_WRITE(SPRPOS(pipe), regs->pos);
+	I915_WRITE(SPRTILEOFF(pipe), regs->tileoff);
+	I915_WRITE(SPRLINOFF(pipe), regs->linoff);
+	I915_WRITE(SPRSIZE(pipe), regs->size);
+	I915_WRITE(SPRSCALE(pipe), regs->scale);
+	I915_WRITE(SPRCTL(pipe), regs->cntr);
+	I915_WRITE(SPRSURF(pipe), regs->surf);
+}
 
-		I915_WRITE(SPRTILEOFF(pipe), (y << 16) | x);
-	} else {
-		unsigned long offset;
+static void
+ivb_update_plane(struct drm_plane *plane,
+		 struct drm_framebuffer *fb,
+		 const struct intel_plane_coords *coords)
+{
+	struct drm_device *dev = plane->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	int pipe = to_intel_plane(plane)->pipe;
+
+	ivb_calc_plane(plane, fb, coords);
+	ivb_commit_plane(plane, fb);
 
-		offset = fb->offsets[0] + y * fb->pitches[0] + x * pixel_size;
-		I915_WRITE(SPRLINOFF(pipe), offset);
-	}
-	I915_WRITE(SPRSIZE(pipe), (crtc_h << 16) | crtc_w);
-	I915_WRITE(SPRSCALE(pipe), sprscale);
-	I915_WRITE(SPRCTL(pipe), sprctl);
-	I915_MODIFY_DISPBASE(SPRSURF(pipe), obj->gtt_offset);
 	POSTING_READ(SPRSURF(pipe));
 }
 
@@ -300,12 +341,13 @@  ivb_disable_plane(struct drm_plane *plane)
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_plane *intel_plane = to_intel_plane(plane);
 	int pipe = intel_plane->pipe;
+	struct intel_plane_regs *regs = &intel_plane->regs;
 
-	I915_WRITE(SPRCTL(pipe), I915_READ(SPRCTL(pipe)) & ~SPRITE_ENABLE);
+	regs->cntr &= ~SPRITE_ENABLE;
 	/* Can't leave the scaler enabled... */
-	I915_WRITE(SPRSCALE(pipe), 0);
-	/* Activate double buffered register update */
-	I915_MODIFY_DISPBASE(SPRSURF(pipe), 0);
+	regs->scale = 0;
+
+	ivb_commit_plane(plane, plane->fb);
 	POSTING_READ(SPRSURF(pipe));
 
 	dev_priv->sprite_scaling_enabled = false;
@@ -318,61 +360,50 @@  ivb_update_colorkey(struct drm_plane *plane,
 {
 	struct drm_device *dev = plane->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_plane *intel_plane;
-	u32 sprctl;
-	int ret = 0;
-
-	intel_plane = to_intel_plane(plane);
+	struct intel_plane *intel_plane = to_intel_plane(plane);
+	struct intel_plane_regs *regs = &intel_plane->regs;
 
-	I915_WRITE(SPRKEYVAL(intel_plane->pipe), key->min_value);
-	I915_WRITE(SPRKEYMAX(intel_plane->pipe), key->max_value);
-	I915_WRITE(SPRKEYMSK(intel_plane->pipe), key->channel_mask);
+	regs->keyval = key->min_value;
+	regs->keymaxval = key->max_value;
+	regs->keymsk = key->channel_mask;
 
-	sprctl = I915_READ(SPRCTL(intel_plane->pipe));
-	sprctl &= ~(SPRITE_SOURCE_KEY | SPRITE_DEST_KEY);
+	regs->cntr &= ~(SPRITE_SOURCE_KEY | SPRITE_DEST_KEY);
 	if (key->flags & I915_SET_COLORKEY_DESTINATION)
-		sprctl |= SPRITE_DEST_KEY;
+		regs->cntr |= SPRITE_DEST_KEY;
 	else if (key->flags & I915_SET_COLORKEY_SOURCE)
-		sprctl |= SPRITE_SOURCE_KEY;
-	I915_WRITE(SPRCTL(intel_plane->pipe), sprctl);
+		regs->cntr |= SPRITE_SOURCE_KEY;
 
+	ivb_commit_plane(plane, plane->fb);
 	POSTING_READ(SPRKEYMSK(intel_plane->pipe));
 
-	return ret;
+	return 0;
 }
 
 static void
 ivb_get_colorkey(struct drm_plane *plane, struct drm_intel_sprite_colorkey *key)
 {
-	struct drm_device *dev = plane->dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_plane *intel_plane;
-	u32 sprctl;
-
-	intel_plane = to_intel_plane(plane);
+	struct intel_plane *intel_plane = to_intel_plane(plane);
+	const struct intel_plane_regs *regs = &intel_plane->regs;
 
-	key->min_value = I915_READ(SPRKEYVAL(intel_plane->pipe));
-	key->max_value = I915_READ(SPRKEYMAX(intel_plane->pipe));
-	key->channel_mask = I915_READ(SPRKEYMSK(intel_plane->pipe));
+	key->min_value = regs->keyval;
+	key->max_value = regs->keymaxval;
+	key->channel_mask = regs->keymsk;
 	key->flags = 0;
 
-	sprctl = I915_READ(SPRCTL(intel_plane->pipe));
-
-	if (sprctl & SPRITE_DEST_KEY)
+	if (regs->cntr & SPRITE_DEST_KEY)
 		key->flags = I915_SET_COLORKEY_DESTINATION;
-	else if (sprctl & SPRITE_SOURCE_KEY)
+	else if (regs->cntr & SPRITE_SOURCE_KEY)
 		key->flags = I915_SET_COLORKEY_SOURCE;
 	else
 		key->flags = I915_SET_COLORKEY_NONE;
 }
 
 static void
-ilk_update_plane(struct drm_plane *plane,
-		 struct drm_framebuffer *fb,
-		 const struct intel_plane_coords *coords)
+ilk_calc_plane(struct drm_plane *plane,
+	       struct drm_framebuffer *fb,
+	       const struct intel_plane_coords *coords)
 {
 	struct drm_device *dev = plane->dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_plane *intel_plane = to_intel_plane(plane);
 	const struct drm_i915_gem_object *obj = to_intel_framebuffer(fb)->obj;
 	int crtc_x = coords->crtc_x;
@@ -383,46 +414,48 @@  ilk_update_plane(struct drm_plane *plane,
 	uint32_t y = coords->src_y;
 	uint32_t src_w = coords->src_w;
 	uint32_t src_h = coords->src_h;
-	int pipe = intel_plane->pipe;
-	u32 dvscntr, dvsscale;
 	int pixel_size = drm_format_plane_cpp(fb->pixel_format, 0);
-
-	dvscntr = I915_READ(DVSCNTR(pipe));
+	struct intel_plane_regs *regs = &intel_plane->regs;
 
 	/* Mask out pixel format bits in case we change it */
-	dvscntr &= ~DVS_PIXFORMAT_MASK;
-	dvscntr &= ~DVS_RGB_ORDER_XBGR;
-	dvscntr &= ~DVS_YUV_BYTE_ORDER_MASK;
+	regs->cntr &= ~(DVS_PIXFORMAT_MASK |
+			DVS_RGB_ORDER_XBGR |
+			DVS_YUV_BYTE_ORDER_MASK |
+			DVS_TRICKLE_FEED_DISABLE |
+			DVS_TILED |
+			DVS_ENABLE);
 
 	switch (fb->pixel_format) {
 	case DRM_FORMAT_XBGR8888:
-		dvscntr |= DVS_FORMAT_RGBX888 | DVS_RGB_ORDER_XBGR;
+		regs->cntr |= DVS_FORMAT_RGBX888 | DVS_RGB_ORDER_XBGR;
 		break;
 	case DRM_FORMAT_XRGB8888:
-		dvscntr |= DVS_FORMAT_RGBX888;
+		regs->cntr |= DVS_FORMAT_RGBX888;
 		break;
 	case DRM_FORMAT_YUYV:
-		dvscntr |= DVS_FORMAT_YUV422 | DVS_YUV_ORDER_YUYV;
+		regs->cntr |= DVS_FORMAT_YUV422 | DVS_YUV_ORDER_YUYV;
 		break;
 	case DRM_FORMAT_YVYU:
-		dvscntr |= DVS_FORMAT_YUV422 | DVS_YUV_ORDER_YVYU;
+		regs->cntr |= DVS_FORMAT_YUV422 | DVS_YUV_ORDER_YVYU;
 		break;
 	case DRM_FORMAT_UYVY:
-		dvscntr |= DVS_FORMAT_YUV422 | DVS_YUV_ORDER_UYVY;
+		regs->cntr |= DVS_FORMAT_YUV422 | DVS_YUV_ORDER_UYVY;
 		break;
 	case DRM_FORMAT_VYUY:
-		dvscntr |= DVS_FORMAT_YUV422 | DVS_YUV_ORDER_VYUY;
+		regs->cntr |= DVS_FORMAT_YUV422 | DVS_YUV_ORDER_VYUY;
 		break;
 	default:
 		BUG();
 	}
 
 	if (obj->tiling_mode != I915_TILING_NONE)
-		dvscntr |= DVS_TILED;
+		regs->cntr |= DVS_TILED;
 
 	if (IS_GEN6(dev))
-		dvscntr |= DVS_TRICKLE_FEED_DISABLE; /* must disable */
-	dvscntr |= DVS_ENABLE;
+		regs->cntr |= DVS_TRICKLE_FEED_DISABLE; /* must disable */
+
+	if (coords->visible)
+		regs->cntr |= DVS_ENABLE;
 
 	/* Sizes are 0 based */
 	src_w--;
@@ -430,29 +463,64 @@  ilk_update_plane(struct drm_plane *plane,
 	crtc_w--;
 	crtc_h--;
 
-	intel_update_sprite_watermarks(dev, pipe, crtc_w, pixel_size);
+	if (coords->visible && (IS_GEN5(dev) || crtc_w != src_w || crtc_h != src_h))
+		regs->scale = DVS_SCALE_ENABLE | (src_w << 16) | src_h;
+	else
+		regs->scale = 0;
 
-	dvsscale = 0;
-	if (IS_GEN5(dev) || crtc_w != src_w || crtc_h != src_h)
-		dvsscale = DVS_SCALE_ENABLE | (src_w << 16) | src_h;
+	regs->stride = fb->pitches[0];
+	regs->pos = (crtc_y << 16) | crtc_x;
 
-	I915_WRITE(DVSSTRIDE(pipe), fb->pitches[0]);
-	I915_WRITE(DVSPOS(pipe), (crtc_y << 16) | crtc_x);
 	if (obj->tiling_mode != I915_TILING_NONE) {
 		y += fb->offsets[0] / fb->pitches[0];
 		x += fb->offsets[0] % fb->pitches[0] / pixel_size;
 
-		I915_WRITE(DVSTILEOFF(pipe), (y << 16) | x);
-	} else {
-		unsigned long offset;
+		regs->tileoff = (y << 16) | x;
+	} else
+		regs->linoff = fb->offsets[0] + y * fb->pitches[0] + x * pixel_size;
+
+	regs->size = (crtc_h << 16) | crtc_w;
+	regs->surf = I915_LO_DISPBASE(regs->surf) | obj->gtt_offset;
+}
+
+static void
+ilk_commit_plane(struct drm_plane *plane, struct drm_framebuffer *fb)
+{
+	struct drm_device *dev = plane->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_plane *intel_plane = to_intel_plane(plane);
+	int pipe = intel_plane->pipe;
+	int pixel_size = fb ? drm_format_plane_cpp(fb->pixel_format, 0) : 0;
+	const struct intel_plane_regs *regs = &intel_plane->regs;
+
+	/* FIXME move somewhere appropriate */
+	intel_update_sprite_watermarks(dev, pipe, regs->size & 0xffff, pixel_size);
+
+	I915_WRITE(DVSKEYVAL(pipe), regs->keyval);
+	I915_WRITE(DVSKEYMAX(pipe), regs->keymaxval);
+	I915_WRITE(DVSKEYMSK(pipe), regs->keymsk);
+	I915_WRITE(DVSSTRIDE(pipe), regs->stride);
+	I915_WRITE(DVSPOS(pipe), regs->pos);
+	I915_WRITE(DVSTILEOFF(pipe), regs->tileoff);
+	I915_WRITE(DVSLINOFF(pipe), regs->linoff);
+	I915_WRITE(DVSSIZE(pipe), regs->size);
+	I915_WRITE(DVSSCALE(pipe), regs->scale);
+	I915_WRITE(DVSCNTR(pipe), regs->cntr);
+	I915_WRITE(DVSSURF(pipe), regs->surf);
+}
+
+static void
+ilk_update_plane(struct drm_plane *plane,
+		 struct drm_framebuffer *fb,
+		 const struct intel_plane_coords *coords)
+{
+	struct drm_device *dev = plane->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	int pipe = to_intel_plane(plane)->pipe;
+
+	ilk_calc_plane(plane, fb, coords);
+	ilk_commit_plane(plane, fb);
 
-		offset = fb->offsets[0] + y * fb->pitches[0] + x * pixel_size;
-		I915_WRITE(DVSLINOFF(pipe), offset);
-	}
-	I915_WRITE(DVSSIZE(pipe), (crtc_h << 16) | crtc_w);
-	I915_WRITE(DVSSCALE(pipe), dvsscale);
-	I915_WRITE(DVSCNTR(pipe), dvscntr);
-	I915_MODIFY_DISPBASE(DVSSURF(pipe), obj->gtt_offset);
 	POSTING_READ(DVSSURF(pipe));
 }
 
@@ -463,22 +531,50 @@  ilk_disable_plane(struct drm_plane *plane)
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_plane *intel_plane = to_intel_plane(plane);
 	int pipe = intel_plane->pipe;
+	struct intel_plane_regs *regs = &intel_plane->regs;
 
-	I915_WRITE(DVSCNTR(pipe), I915_READ(DVSCNTR(pipe)) & ~DVS_ENABLE);
+	regs->cntr &= ~DVS_ENABLE;
 	/* Disable the scaler */
-	I915_WRITE(DVSSCALE(pipe), 0);
-	/* Flush double buffered register updates */
-	I915_MODIFY_DISPBASE(DVSSURF(pipe), 0);
+	regs->scale = 0;
+
+	ilk_commit_plane(plane, NULL);
 	POSTING_READ(DVSSURF(pipe));
 }
 
+
+void intel_sprite_calc(struct drm_plane *plane, struct drm_framebuffer *fb,
+		       const struct intel_plane_coords *coords)
+{
+	struct drm_device *dev = plane->dev;
+
+	if (INTEL_INFO(dev)->gen == 7)
+		ivb_calc_plane(plane, fb, coords);
+	else
+		ilk_calc_plane(plane, fb, coords);
+}
+
+void intel_sprite_prepare(struct drm_plane *plane)
+{
+}
+
+void intel_sprite_commit(struct drm_plane *plane)
+{
+	struct drm_device *dev = plane->dev;
+
+	if (INTEL_INFO(dev)->gen == 7)
+		ivb_commit_plane(plane, plane->fb);
+	else
+		ilk_commit_plane(plane, plane->fb);
+}
+
+void intel_plane_commit(struct drm_crtc *crtc);
+
 static void
 intel_enable_primary(struct drm_crtc *crtc)
 {
 	struct drm_device *dev = crtc->dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
-	int reg = DSPCNTR(intel_crtc->plane);
+	struct intel_plane_regs *regs = &intel_crtc->primary_regs;
 
 	if (!intel_crtc->primary_disabled)
 		return;
@@ -486,21 +582,26 @@  intel_enable_primary(struct drm_crtc *crtc)
 	intel_crtc->primary_disabled = false;
 	intel_update_fbc(dev);
 
-	I915_WRITE(reg, I915_READ(reg) | DISPLAY_PLANE_ENABLE);
+	regs->cntr |= DISPLAY_PLANE_ENABLE;
+
+	// FIXME
+	intel_plane_commit(crtc);
 }
 
 static void
 intel_disable_primary(struct drm_crtc *crtc)
 {
 	struct drm_device *dev = crtc->dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
-	int reg = DSPCNTR(intel_crtc->plane);
+	struct intel_plane_regs *regs = &intel_crtc->primary_regs;
 
 	if (intel_crtc->primary_disabled)
 		return;
 
-	I915_WRITE(reg, I915_READ(reg) & ~DISPLAY_PLANE_ENABLE);
+	regs->cntr &= ~DISPLAY_PLANE_ENABLE;
+
+	// FIXME
+	intel_plane_commit(crtc);
 
 	intel_crtc->primary_disabled = true;
 	intel_update_fbc(dev);
@@ -512,49 +613,39 @@  ilk_update_colorkey(struct drm_plane *plane,
 {
 	struct drm_device *dev = plane->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_plane *intel_plane;
-	u32 dvscntr;
-	int ret = 0;
-
-	intel_plane = to_intel_plane(plane);
+	struct intel_plane *intel_plane = to_intel_plane(plane);
+	struct intel_plane_regs *regs = &intel_plane->regs;
 
-	I915_WRITE(DVSKEYVAL(intel_plane->pipe), key->min_value);
-	I915_WRITE(DVSKEYMAX(intel_plane->pipe), key->max_value);
-	I915_WRITE(DVSKEYMSK(intel_plane->pipe), key->channel_mask);
+	regs->keyval = key->min_value;
+	regs->keymaxval = key->max_value;
+	regs->keymsk = key->channel_mask;
 
-	dvscntr = I915_READ(DVSCNTR(intel_plane->pipe));
-	dvscntr &= ~(DVS_SOURCE_KEY | DVS_DEST_KEY);
+	regs->cntr &= ~(DVS_SOURCE_KEY | DVS_DEST_KEY);
 	if (key->flags & I915_SET_COLORKEY_DESTINATION)
-		dvscntr |= DVS_DEST_KEY;
+		regs->cntr |= DVS_DEST_KEY;
 	else if (key->flags & I915_SET_COLORKEY_SOURCE)
-		dvscntr |= DVS_SOURCE_KEY;
-	I915_WRITE(DVSCNTR(intel_plane->pipe), dvscntr);
+		regs->cntr |= DVS_SOURCE_KEY;
 
+	ilk_commit_plane(plane, plane->fb);
 	POSTING_READ(DVSKEYMSK(intel_plane->pipe));
 
-	return ret;
+	return 0;
 }
 
 static void
 ilk_get_colorkey(struct drm_plane *plane, struct drm_intel_sprite_colorkey *key)
 {
-	struct drm_device *dev = plane->dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct intel_plane *intel_plane;
-	u32 dvscntr;
-
-	intel_plane = to_intel_plane(plane);
+	struct intel_plane *intel_plane = to_intel_plane(plane);
+	const struct intel_plane_regs *regs = &intel_plane->regs;
 
-	key->min_value = I915_READ(DVSKEYVAL(intel_plane->pipe));
-	key->max_value = I915_READ(DVSKEYMAX(intel_plane->pipe));
-	key->channel_mask = I915_READ(DVSKEYMSK(intel_plane->pipe));
+	key->min_value = regs->keyval;
+	key->max_value = regs->keymaxval;
+	key->channel_mask = regs->keymsk;
 	key->flags = 0;
 
-	dvscntr = I915_READ(DVSCNTR(intel_plane->pipe));
-
-	if (dvscntr & DVS_DEST_KEY)
+	if (regs->cntr & DVS_DEST_KEY)
 		key->flags = I915_SET_COLORKEY_DESTINATION;
-	else if (dvscntr & DVS_SOURCE_KEY)
+	else if (regs->cntr & DVS_SOURCE_KEY)
 		key->flags = I915_SET_COLORKEY_SOURCE;
 	else
 		key->flags = I915_SET_COLORKEY_NONE;
@@ -803,6 +894,7 @@  static uint32_t snb_plane_formats[] = {
 int
 intel_plane_init(struct drm_device *dev, enum pipe pipe)
 {
+	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_plane *intel_plane;
 	unsigned long possible_crtcs;
 	const uint32_t *plane_formats;
@@ -825,6 +917,9 @@  intel_plane_init(struct drm_device *dev, enum pipe pipe)
 		intel_plane->update_colorkey = ilk_update_colorkey;
 		intel_plane->get_colorkey = ilk_get_colorkey;
 
+		intel_plane->regs.cntr = I915_READ(DVSCNTR(pipe));
+		intel_plane->regs.surf = I915_READ(DVSSURF(pipe));
+
 		if (IS_GEN6(dev)) {
 			plane_formats = snb_plane_formats;
 			num_plane_formats = ARRAY_SIZE(snb_plane_formats);
@@ -841,6 +936,9 @@  intel_plane_init(struct drm_device *dev, enum pipe pipe)
 		intel_plane->update_colorkey = ivb_update_colorkey;
 		intel_plane->get_colorkey = ivb_get_colorkey;
 
+		intel_plane->regs.cntr = I915_READ(SPRCTL(pipe));
+		intel_plane->regs.surf = I915_READ(SPRSURF(pipe));
+
 		plane_formats = snb_plane_formats;
 		num_plane_formats = ARRAY_SIZE(snb_plane_formats);
 		break;