
[29/30] drm/i915: Track fence setup separately from fenced object lifetime

Message ID 1302640318-23165-30-git-send-email-chris@chris-wilson.co.uk (mailing list archive)
State New, archived

Commit Message

Chris Wilson April 12, 2011, 8:31 p.m. UTC
This fixes a bookkeeping error causing an OOPS whilst waiting for an
object to finish using a fence. Now we can simply wait for the fence to
be written independent of the objects currently inhabiting it (past,
present and future).

A large amount of the change is to delay updating the information about
the fence on the bo until after we successfully write, or queue the write to,
the register. This avoids the complication of undoing a partial change
should we fail in pipelining the change.

Cc: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_drv.h |    1 +
 drivers/gpu/drm/i915/i915_gem.c |  155 ++++++++++++++++++++-------------------
 2 files changed, 82 insertions(+), 74 deletions(-)
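The delayed-update pattern the commit message describes — write (or queue the write to) the register first, and touch the bookkeeping only on success — means the failure path never has partial state to undo. A minimal user-space sketch of that pattern (hypothetical names, not the driver's code):

```c
#include <assert.h>
#include <stdbool.h>

struct fence_reg { int obj_id; unsigned int setup_seqno; };

static bool hw_write_ok; /* stand-in for whether the register write succeeds */

/* Model of a fence-register write that can fail without side effects. */
static int write_fence_reg(struct fence_reg *reg, int obj_id)
{
	(void)reg;
	(void)obj_id;
	return hw_write_ok ? 0 : -5; /* an -EIO-style failure */
}

/* Commit the bookkeeping only after the write (or queued write) succeeds,
 * so there is never a partial change to roll back. */
static int get_fence(struct fence_reg *reg, int obj_id, unsigned int seqno)
{
	int ret = write_fence_reg(reg, obj_id);
	if (ret)
		return ret;           /* bookkeeping untouched on failure */
	reg->obj_id = obj_id;         /* update state only on success */
	reg->setup_seqno = seqno;
	return 0;
}
```

The earlier code updated the object and register state before attempting the write, which is exactly what forced the undo logic this patch removes.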

Comments

Daniel Vetter April 13, 2011, 8:42 p.m. UTC | #1
On Tue, Apr 12, 2011 at 09:31:57PM +0100, Chris Wilson wrote:
> This fixes a bookkeeping error causing an OOPS whilst waiting for an
> object to finish using a fence. Now we can simply wait for the fence to
> be written independent of the objects currently inhabiting it (past,
> present and future).
> 
> A large amount of the change is to delay updating the information about
> the fence on bo until after we successfully write, or queue the write to,
> the register. This avoids the complication of undoing a partial change
> should we fail in pipelining the change.
> 
> Cc: Andy Whitcroft <apw@canonical.com>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

I think that r-b is stale ;-) Still holds though for the general idea. A
few nitpicks below.

One general comment: I think we should get completely rid of
last_fenced_ring. There should be no way an object can change rings
without being at least completely flushed (or even going through the
inactive list).  Maybe that's for a separate patch but I'm slightly uneasy
with the fact that we don't seem to systematically clear last_fenced_ring
_anywhere_.

> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index ca14a86..1949048 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1731,6 +1731,8 @@ i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
>  	i915_gem_object_move_off_active(obj);
>  	obj->fenced_gpu_access = false;
>  
> +	obj->last_fenced_seqno = 0;
> +

I think we could move that to move_off_active where last_rendering_seqno
is being reset. Would be slightly more consistent. Resetting
last_fenced_ring together with last_fenced_seqno probably makes sense, too.

> @@ -2675,47 +2661,43 @@ i915_gem_object_get_fence(struct drm_i915_gem_object *obj,
>  	if (reg == NULL)
>  		return -ENOSPC;
>  
> -	ret = i915_gem_object_flush_fence(obj, pipelined);
> -	if (ret)
> -		return ret;
> -
> -	if (reg->obj) {
> -		struct drm_i915_gem_object *old = reg->obj;
> -
> +	if ((old = reg->obj)) {

Argh. Can you move the assignment out?

> @@ -2732,7 +2714,31 @@ update:
>  		ret = i830_write_fence_reg(obj, pipelined, regnum);
>  		break;
>  	}
> +	if (ret)
> +		goto err;
> +
> +	if (pipelined) {
> +		reg->setup_seqno = i915_gem_next_request_seqno(pipelined);
> +		reg->setup_ring = pipelined;
> +		if (old) {
> +			old->last_fenced_ring = pipelined;
> +			old->last_fenced_seqno = reg->setup_seqno;
> +		}

This looks superfluous. flush_fence should take care of this either
directly or via flush_ring -> process_flushing_list -> move_to_active.
If it's just paranoia, can this be converted to a WARN_ON? Or is this
closing a gap I'm not seeing?
-Daniel
Chris Wilson April 13, 2011, 9:56 p.m. UTC | #2
On Wed, 13 Apr 2011 22:42:23 +0200, Daniel Vetter <daniel@ffwll.ch> wrote:
> On Tue, Apr 12, 2011 at 09:31:57PM +0100, Chris Wilson wrote:
> > This fixes a bookkeeping error causing an OOPS whilst waiting for an
> > object to finish using a fence. Now we can simply wait for the fence to
> > be written independent of the objects currently inhabiting it (past,
> > present and future).
> > 
> > A large amount of the change is to delay updating the information about
> > the fence on bo until after we successfully write, or queue the write to,
> > the register. This avoids the complication of undoing a partial change
> > should we fail in pipelining the change.
> > 
> > Cc: Andy Whitcroft <apw@canonical.com>
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> 
> I think that r-b is stale ;-) Still holds though for the general idea. A
> few nitpicks below.

Meh, reviewers are fickle. I'm pretty sure I have not changed the code
since the last time I put it in front of you. Much. ;-)

> One general comment: I think we should get completely rid of
> last_fenced_ring. There should be no way an object can change rings
> without being at least completely flushed (or even going through the
> inactive list).  Maybe that's for a separate patch but I'm slightly uneasy
> with the fact that we don't seem to systematically clear last_fenced_ring
> _anywhere_.

Ah. That was to make sure you were paying attention. last_fenced_seqno was
the guard.

last_fenced_ring is the complexity that holds it all together sadly. Every
time I try to eliminate it, I keep coming back to it as the cleanest
solution.
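The guard Chris describes — last_fenced_seqno == 0 meaning "nothing outstanding", so a stale last_fenced_ring is never consulted — can be modeled in a few lines (field names follow the patch, but the logic is a simplified sketch, not the driver's):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct ring { unsigned int completed_seqno; };

struct gem_object {
	struct ring *last_fenced_ring;  /* may be stale or NULL... */
	unsigned int last_fenced_seqno; /* ...but only consulted when != 0 */
};

/* Wrap-safe seqno comparison, in the style the i915 code uses. */
static bool ring_passed_seqno(struct ring *ring, unsigned int seqno)
{
	return (int)(ring->completed_seqno - seqno) >= 0;
}

/* Return true if a CPU wait on last_fenced_ring would be required. */
static bool fence_wait_needed(struct gem_object *obj)
{
	if (obj->last_fenced_seqno == 0)
		return false; /* guard: last_fenced_ring never dereferenced */
	return !ring_passed_seqno(obj->last_fenced_ring,
				  obj->last_fenced_seqno);
}
```

This is why last_fenced_ring never needs systematic clearing: as long as every path that retires the fence zeroes last_fenced_seqno, the stale pointer is dead data.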

> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index ca14a86..1949048 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -1731,6 +1731,8 @@ i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
> >  	i915_gem_object_move_off_active(obj);
> >  	obj->fenced_gpu_access = false;
> >  
> > +	obj->last_fenced_seqno = 0;
> > +
> 
> I think we could move that to move_off_active where last_rendering_seqno
> is being reset. Would be slightly more consistent. Resetting
> last_fenced_ring together with last_fenced_seqno probably makes sense, too.

Right, the choice of setting last_fenced_seqno to 0 in move_off_active() or
move_to_inactive() doesn't affect flush_fence.

> > @@ -2675,47 +2661,43 @@ i915_gem_object_get_fence(struct drm_i915_gem_object *obj,
> >  	if (reg == NULL)
> >  		return -ENOSPC;
> >  
> > -	ret = i915_gem_object_flush_fence(obj, pipelined);
> > -	if (ret)
> > -		return ret;
> > -
> > -	if (reg->obj) {
> > -		struct drm_i915_gem_object *old = reg->obj;
> > -
> > +	if ((old = reg->obj)) {
> 
> Argh. Can you move the assignment out?

Must remember to use this trick of putting in eyesores to distract from the
rest of the code!

> > @@ -2732,7 +2714,31 @@ update:
> >  		ret = i830_write_fence_reg(obj, pipelined, regnum);
> >  		break;
> >  	}
> > +	if (ret)
> > +		goto err;
> > +
> > +	if (pipelined) {
> > +		reg->setup_seqno = i915_gem_next_request_seqno(pipelined);
> > +		reg->setup_ring = pipelined;
> > +		if (old) {
> > +			old->last_fenced_ring = pipelined;
> > +			old->last_fenced_seqno = reg->setup_seqno;
> > +		}
> 
> This looks superfluous. flush_fence should take care of this either
> directly or via flush_ring -> process_flushing_list -> move_to_active.
> If it's just paranoia, can this be converted to a WARN_ON? Or is this
> closing a gap I'm not seeing?

Oh, this is absolutely vital. Too tired, and this is definitely one that
has to be explained whilst fresh.
-Chris
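The reg->setup_ring/reg->setup_seqno pair the patch introduces records a pipelined (GPU-queued) register write; the diff then waits for it whenever the next user of the register is the CPU or a different ring. A simplified model of that rule (assumed names, not driver code):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct ring { int id; };

struct fence_reg {
	struct ring *setup_ring;  /* ring the register write was queued on */
	unsigned int setup_seqno; /* 0 once the write is known complete */
};

/* Does reusing this register from `pipelined` require waiting for the
 * queued setup?  pipelined == NULL models CPU access, as in get_fence();
 * a request on the same ring is already ordered behind the setup. */
static bool must_wait_for_setup(struct fence_reg *reg, struct ring *pipelined)
{
	return reg->setup_seqno != 0 && pipelined != reg->setup_ring;
}
```

Tracking the setup on the register rather than only on the object is what lets the code wait for the fence write itself, independent of whichever objects have occupied the register.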

Patch

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index c837e10..d1fadb8 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -129,6 +129,7 @@  struct drm_i915_master_private {
 struct drm_i915_fence_reg {
 	struct list_head lru_list;
 	struct drm_i915_gem_object *obj;
+	struct intel_ring_buffer *setup_ring;
 	uint32_t setup_seqno;
 };
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index ca14a86..1949048 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1731,6 +1731,8 @@  i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
 	i915_gem_object_move_off_active(obj);
 	obj->fenced_gpu_access = false;
 
+	obj->last_fenced_seqno = 0;
+
 	obj->active = 0;
 	obj->pending_gpu_write = false;
 	drm_gem_object_unreference(&obj->base);
@@ -1896,7 +1898,6 @@  static void i915_gem_reset_fences(struct drm_device *dev)
 		reg->obj->fence_reg = I915_FENCE_REG_NONE;
 		reg->obj->fenced_gpu_access = false;
 		reg->obj->last_fenced_seqno = 0;
-		reg->obj->last_fenced_ring = NULL;
 		i915_gem_clear_fence_reg(dev, reg);
 	}
 }
@@ -2497,7 +2498,7 @@  i915_gem_object_flush_fence(struct drm_i915_gem_object *obj,
 
 	if (obj->fenced_gpu_access) {
 		if (obj->base.write_domain & I915_GEM_GPU_DOMAINS) {
-			ret = i915_gem_flush_ring(obj->last_fenced_ring,
+			ret = i915_gem_flush_ring(obj->ring,
 						  0, obj->base.write_domain);
 			if (ret)
 				return ret;
@@ -2508,17 +2509,22 @@  i915_gem_object_flush_fence(struct drm_i915_gem_object *obj,
 		obj->fenced_gpu_access = false;
 	}
 
+	if (obj->last_fenced_seqno &&
+	    ring_passed_seqno(obj->last_fenced_ring, obj->last_fenced_seqno))
+		obj->last_fenced_seqno = 0;
+
 	if (obj->last_fenced_seqno && pipelined != obj->last_fenced_ring) {
-		if (!ring_passed_seqno(obj->last_fenced_ring,
-				       obj->last_fenced_seqno)) {
-			ret = i915_wait_request(obj->last_fenced_ring,
-						obj->last_fenced_seqno);
-			if (ret)
-				return ret;
-		}
+		ret = i915_wait_request(obj->last_fenced_ring,
+					obj->last_fenced_seqno);
+		if (ret)
+			return ret;
 
+		/* Since last_fence_seqno can retire much earlier than
+		 * last_rendering_seqno, we track that here for efficiency.
+		 * (With a catch-all in move_to_inactive() to prevent very
+		 * old seqno from lying around.)
+		 */
 		obj->last_fenced_seqno = 0;
-		obj->last_fenced_ring = NULL;
 	}
 
 	/* Ensure that all CPU reads are completed before installing a fence
@@ -2585,7 +2591,7 @@  i915_find_fence_reg(struct drm_device *dev,
 			first = reg;
 
 		if (!pipelined ||
-		    !reg->obj->last_fenced_ring ||
+		    !reg->obj->last_fenced_seqno ||
 		    reg->obj->last_fenced_ring == pipelined) {
 			avail = reg;
 			break;
@@ -2602,7 +2608,6 @@  i915_find_fence_reg(struct drm_device *dev,
  * i915_gem_object_get_fence - set up a fence reg for an object
  * @obj: object to map through a fence reg
  * @pipelined: ring on which to queue the change, or NULL for CPU access
- * @interruptible: must we wait uninterruptibly for the register to retire?
  *
  * When mapping objects through the GTT, userspace wants to be able to write
  * to them without having to worry about swizzling if the object is tiled.
@@ -2610,6 +2615,10 @@  i915_find_fence_reg(struct drm_device *dev,
  * This function walks the fence regs looking for a free one for @obj,
  * stealing one if it can't find any.
  *
+ * Note: if two fence registers point to the same or overlapping memory region
+ * the results are undefined. This is even more fun with asynchronous updates
+ * via the GPU!
+ *
  * It then sets up the reg based on the object's properties: address, pitch
  * and tiling format.
  */
@@ -2620,6 +2629,7 @@  i915_gem_object_get_fence(struct drm_i915_gem_object *obj,
 	struct drm_device *dev = obj->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_fence_reg *reg;
+	struct drm_i915_gem_object *old = NULL;
 	int regnum;
 	int ret;
 
@@ -2629,45 +2639,21 @@  i915_gem_object_get_fence(struct drm_i915_gem_object *obj,
 	/* Just update our place in the LRU if our fence is getting reused. */
 	if (obj->fence_reg != I915_FENCE_REG_NONE) {
 		reg = &dev_priv->fence_regs[obj->fence_reg];
-		list_move_tail(&reg->lru_list, &dev_priv->mm.fence_list);
-
-		if (obj->tiling_changed) {
-			ret = i915_gem_object_flush_fence(obj, pipelined);
-			if (ret)
-				return ret;
-
-			if (!obj->fenced_gpu_access && !obj->last_fenced_seqno)
-				pipelined = NULL;
-
-			if (pipelined) {
-				reg->setup_seqno =
-					i915_gem_next_request_seqno(pipelined);
-				obj->last_fenced_seqno = reg->setup_seqno;
-				obj->last_fenced_ring = pipelined;
-			}
 
+		if (obj->tiling_changed)
 			goto update;
-		}
-
-		if (!pipelined) {
-			if (reg->setup_seqno) {
-				if (!ring_passed_seqno(obj->last_fenced_ring,
-						       reg->setup_seqno)) {
-					ret = i915_wait_request(obj->last_fenced_ring,
-								reg->setup_seqno);
-					if (ret)
-						return ret;
-				}
 
-				reg->setup_seqno = 0;
-			}
-		} else if (obj->last_fenced_ring &&
-			   obj->last_fenced_ring != pipelined) {
-			ret = i915_gem_object_flush_fence(obj, pipelined);
+		if (reg->setup_seqno && pipelined != reg->setup_ring) {
+			ret = i915_wait_request(reg->setup_ring,
+						reg->setup_seqno);
 			if (ret)
 				return ret;
+
+			reg->setup_ring = 0;
+			reg->setup_seqno = 0;
 		}
 
+		list_move_tail(&reg->lru_list, &dev_priv->mm.fence_list);
 		return 0;
 	}
 
@@ -2675,47 +2661,43 @@  i915_gem_object_get_fence(struct drm_i915_gem_object *obj,
 	if (reg == NULL)
 		return -ENOSPC;
 
-	ret = i915_gem_object_flush_fence(obj, pipelined);
-	if (ret)
-		return ret;
-
-	if (reg->obj) {
-		struct drm_i915_gem_object *old = reg->obj;
-
+	if ((old = reg->obj)) {
 		drm_gem_object_reference(&old->base);
 
 		if (old->tiling_mode)
 			i915_gem_release_mmap(old);
 
-		ret = i915_gem_object_flush_fence(old, pipelined);
-		if (ret) {
-			drm_gem_object_unreference(&old->base);
-			return ret;
-		}
+		ret = i915_gem_object_flush_fence(old, NULL); //pipelined);
+		if (ret)
+			goto err;
 
-		if (old->last_fenced_seqno == 0 && obj->last_fenced_seqno == 0)
-			pipelined = NULL;
+		/* Mark the fence register as in-use if pipelined */
+		reg->setup_ring = old->last_fenced_ring;
+		reg->setup_seqno = old->last_fenced_seqno;
+	}
 
-		old->fence_reg = I915_FENCE_REG_NONE;
-		old->last_fenced_ring = pipelined;
-		old->last_fenced_seqno =
-			pipelined ? i915_gem_next_request_seqno(pipelined) : 0;
+update:
+	ret = i915_gem_object_flush_fence(obj, pipelined);
+	if (ret)
+		goto err;
 
-		drm_gem_object_unreference(&old->base);
-	} else if (obj->last_fenced_seqno == 0)
-		pipelined = NULL;
+	if (reg->setup_seqno && pipelined != reg->setup_ring) {
+		ret = i915_wait_request(reg->setup_ring,
+					reg->setup_seqno);
+		if (ret)
+			goto err;
 
-	reg->obj = obj;
-	list_move_tail(&reg->lru_list, &dev_priv->mm.fence_list);
-	obj->fence_reg = reg - dev_priv->fence_regs;
-	obj->last_fenced_ring = pipelined;
+		reg->setup_ring = 0;
+		reg->setup_seqno = 0;
+	}
 
-	reg->setup_seqno =
-		pipelined ? i915_gem_next_request_seqno(pipelined) : 0;
-	obj->last_fenced_seqno = reg->setup_seqno;
+	/* If we had a pipelined request, but there is no pending GPU access or
+	 * update to a fence register for this memory region, we can write
+	 * the new fence register immediately.
+	 */
+	if (obj->last_fenced_seqno == 0 && reg->setup_seqno == 0)
+		pipelined = NULL;
 
-update:
-	obj->tiling_changed = false;
 	regnum = reg - dev_priv->fence_regs;
 	switch (INTEL_INFO(dev)->gen) {
 	case 6:
@@ -2732,7 +2714,31 @@  update:
 		ret = i830_write_fence_reg(obj, pipelined, regnum);
 		break;
 	}
+	if (ret)
+		goto err;
+
+	if (pipelined) {
+		reg->setup_seqno = i915_gem_next_request_seqno(pipelined);
+		reg->setup_ring = pipelined;
+		if (old) {
+			old->last_fenced_ring = pipelined;
+			old->last_fenced_seqno = reg->setup_seqno;
+		}
+	}
+
+	if (old) {
+		old->fence_reg = I915_FENCE_REG_NONE;
+		drm_gem_object_unreference(&old->base);
+	}
+
+	list_move_tail(&reg->lru_list, &dev_priv->mm.fence_list);
+	reg->obj = obj;
+	obj->fence_reg = regnum;
+	obj->tiling_changed = false;
+	return 0;
 
+err:
+	drm_gem_object_unreference(&old->base);
 	return ret;
 }
 
@@ -2771,6 +2777,7 @@  i915_gem_clear_fence_reg(struct drm_device *dev,
 
 	list_del_init(&reg->lru_list);
 	reg->obj = NULL;
+	reg->setup_ring = 0;
 	reg->setup_seqno = 0;
 }