diff mbox

drm/i915: fix up error cleanup in i915_gem_object_bind_to_gtt

Message ID 1374487958-18999-1-git-send-email-daniel.vetter@ffwll.ch (mailing list archive)
State New, archived
Headers show

Commit Message

Daniel Vetter July 22, 2013, 10:12 a.m. UTC
This has been broken in

commit 2f63315692b1d3c055972ad33fc7168ae908b97b
Author: Ben Widawsky <ben@bwidawsk.net>
Date:   Wed Jul 17 12:19:03 2013 -0700

    drm/i915: Create VMAs

which resulted in an OOPS the first time around we've hit -ENOSPC.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67156
Cc: Imre Deak <imre.deak@intel.com>
Cc: Ben Widawsky <ben@bwidawsk.net>
Tested-by: meng <mengmeng.meng@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_gem.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

Comments

Chris Wilson July 22, 2013, 10:34 a.m. UTC | #1
On Mon, Jul 22, 2013 at 12:12:38PM +0200, Daniel Vetter wrote:
> This has been broken in
> 
> commit 2f63315692b1d3c055972ad33fc7168ae908b97b
> Author: Ben Widawsky <ben@bwidawsk.net>
> Date:   Wed Jul 17 12:19:03 2013 -0700
> 
>     drm/i915: Create VMAs
> 
> which resulted in an OOPS the first time around we've hit -ENOSPC.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67156
> Cc: Imre Deak <imre.deak@intel.com>
> Cc: Ben Widawsky <ben@bwidawsk.net>
> Tested-by: meng <mengmeng.meng@intel.com>
> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

I tried and failed to demonstrate that this was overkill.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris
Ben Widawsky July 22, 2013, 8:12 p.m. UTC | #2
On Mon, Jul 22, 2013 at 12:12:38PM +0200, Daniel Vetter wrote:
> This has been broken in
> 
> commit 2f63315692b1d3c055972ad33fc7168ae908b97b
> Author: Ben Widawsky <ben@bwidawsk.net>
> Date:   Wed Jul 17 12:19:03 2013 -0700
> 
>     drm/i915: Create VMAs
> 
> which resulted in an OOPS the first time around we've hit -ENOSPC.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67156
> Cc: Imre Deak <imre.deak@intel.com>
> Cc: Ben Widawsky <ben@bwidawsk.net>
> Tested-by: meng <mengmeng.meng@intel.com>
> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 14 ++++++++------
>  1 file changed, 8 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index cfa6588..c87a6ec 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3121,8 +3121,8 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
>  
>  	vma = i915_gem_vma_create(obj, &dev_priv->gtt.base);
>  	if (IS_ERR(vma)) {
> -		i915_gem_object_unpin_pages(obj);
> -		return PTR_ERR(vma);
> +		ret = PTR_ERR(vma);
> +		goto err_unpin;
>  	}

Adding the extra goto here seems pointless to me.

>  
>  search_free:
> @@ -3138,17 +3138,17 @@ search_free:
>  		if (ret == 0)
>  			goto search_free;
>  
> -		goto err_out;
> +		goto err_free_vma;
>  	}

My preference would be to exit early in drm_mm_remove_node() if the node
isn't allocated. I think at least we should add a WARN to
drm_mm_remove_node if the node->allocated == 0.

>  	if (WARN_ON(!i915_gem_valid_gtt_space(dev, &vma->node,
>  					      obj->cache_level))) {
>  		ret = -EINVAL;
> -		goto err_out;
> +		goto err_remove_node;
>  	}
>  
>  	ret = i915_gem_gtt_prepare_object(obj);
>  	if (ret)
> -		goto err_out;
> +		goto err_remove_node;
>  
>  	list_move_tail(&obj->global_list, &dev_priv->mm.bound_list);
>  	list_add_tail(&obj->mm_list, &vm->inactive_list);
> @@ -3167,9 +3167,11 @@ search_free:
>  	i915_gem_verify_gtt(dev);
>  	return 0;
>  
> -err_out:
> +err_remove_node:
>  	drm_mm_remove_node(&vma->node);
> +err_free_vma:
>  	i915_gem_vma_destroy(vma);
> +err_unpin:
>  	i915_gem_object_unpin_pages(obj);
>  	return ret;
>  }

Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Daniel Vetter July 22, 2013, 9:02 p.m. UTC | #3
On Mon, Jul 22, 2013 at 01:12:30PM -0700, Ben Widawsky wrote:
> On Mon, Jul 22, 2013 at 12:12:38PM +0200, Daniel Vetter wrote:
> > This has been broken in
> > 
> > commit 2f63315692b1d3c055972ad33fc7168ae908b97b
> > Author: Ben Widawsky <ben@bwidawsk.net>
> > Date:   Wed Jul 17 12:19:03 2013 -0700
> > 
> >     drm/i915: Create VMAs
> > 
> > which resulted in an OOPS the first time around we've hit -ENOSPC.
> > 
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67156
> > Cc: Imre Deak <imre.deak@intel.com>
> > Cc: Ben Widawsky <ben@bwidawsk.net>
> > Tested-by: meng <mengmeng.meng@intel.com>
> > Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> > ---
> >  drivers/gpu/drm/i915/i915_gem.c | 14 ++++++++------
> >  1 file changed, 8 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index cfa6588..c87a6ec 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -3121,8 +3121,8 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
> >  
> >  	vma = i915_gem_vma_create(obj, &dev_priv->gtt.base);
> >  	if (IS_ERR(vma)) {
> > -		i915_gem_object_unpin_pages(obj);
> > -		return PTR_ERR(vma);
> > +		ret = PTR_ERR(vma);
> > +		goto err_unpin;
> >  	}
> 
> Adding the extra goto here seems pointless to me.

Like explained on irc, that's just to have a nice OCD reverse err_foo:
label stacking at the end of the function.

> 
> >  
> >  search_free:
> > @@ -3138,17 +3138,17 @@ search_free:
> >  		if (ret == 0)
> >  			goto search_free;
> >  
> > -		goto err_out;
> > +		goto err_free_vma;
> >  	}
> 
> My preference would be to exit early in drm_mm_remove_node() if the node
> isn't allocated. I think at least we should add a WARN to
> drm_mm_remove_node if the node->allocated == 0.

Hm, good idea. I'll create a quick patch.

> >  	if (WARN_ON(!i915_gem_valid_gtt_space(dev, &vma->node,
> >  					      obj->cache_level))) {
> >  		ret = -EINVAL;
> > -		goto err_out;
> > +		goto err_remove_node;
> >  	}
> >  
> >  	ret = i915_gem_gtt_prepare_object(obj);
> >  	if (ret)
> > -		goto err_out;
> > +		goto err_remove_node;
> >  
> >  	list_move_tail(&obj->global_list, &dev_priv->mm.bound_list);
> >  	list_add_tail(&obj->mm_list, &vm->inactive_list);
> > @@ -3167,9 +3167,11 @@ search_free:
> >  	i915_gem_verify_gtt(dev);
> >  	return 0;
> >  
> > -err_out:
> > +err_remove_node:
> >  	drm_mm_remove_node(&vma->node);
> > +err_free_vma:
> >  	i915_gem_vma_destroy(vma);
> > +err_unpin:
> >  	i915_gem_object_unpin_pages(obj);
> >  	return ret;
> >  }
> 
> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>

Queued for -next, thanks for the review.
-Daniel
Daniel Vetter July 23, 2013, 6:10 a.m. UTC | #4
On Mon, Jul 22, 2013 at 11:02:08PM +0200, Daniel Vetter wrote:
> On Mon, Jul 22, 2013 at 01:12:30PM -0700, Ben Widawsky wrote:
> > On Mon, Jul 22, 2013 at 12:12:38PM +0200, Daniel Vetter wrote:
> > > This has been broken in
> > > 
> > > commit 2f63315692b1d3c055972ad33fc7168ae908b97b
> > > Author: Ben Widawsky <ben@bwidawsk.net>
> > > Date:   Wed Jul 17 12:19:03 2013 -0700
> > > 
> > >     drm/i915: Create VMAs
> > > 
> > > which resulted in an OOPS the first time around we've hit -ENOSPC.
> > > 
> > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67156
> > > Cc: Imre Deak <imre.deak@intel.com>
> > > Cc: Ben Widawsky <ben@bwidawsk.net>
> > > Tested-by: meng <mengmeng.meng@intel.com>
> > > Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> > > ---
> > >  drivers/gpu/drm/i915/i915_gem.c | 14 ++++++++------
> > >  1 file changed, 8 insertions(+), 6 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > > index cfa6588..c87a6ec 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > @@ -3121,8 +3121,8 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
> > >  
> > >  	vma = i915_gem_vma_create(obj, &dev_priv->gtt.base);
> > >  	if (IS_ERR(vma)) {
> > > -		i915_gem_object_unpin_pages(obj);
> > > -		return PTR_ERR(vma);
> > > +		ret = PTR_ERR(vma);
> > > +		goto err_unpin;
> > >  	}
> > 
> > Adding the extra goto here seems pointless to me.
> 
> Like explained on irc, that's just to have a nice OCD reverse err_foo:
> label stacking at the end of the function.
> 
> > 
> > >  
> > >  search_free:
> > > @@ -3138,17 +3138,17 @@ search_free:
> > >  		if (ret == 0)
> > >  			goto search_free;
> > >  
> > > -		goto err_out;
> > > +		goto err_free_vma;
> > >  	}
> > 
> > My preference would be to exit early in drm_mm_remove_node() if the node
> > isn't allocated. I think at least we should add a WARN to
> > drm_mm_remove_node if the node->allocated == 0.
> 
> Hm, good idea. I'll create a quick patch.
> 
> > >  	if (WARN_ON(!i915_gem_valid_gtt_space(dev, &vma->node,
> > >  					      obj->cache_level))) {
> > >  		ret = -EINVAL;
> > > -		goto err_out;
> > > +		goto err_remove_node;
> > >  	}
> > >  
> > >  	ret = i915_gem_gtt_prepare_object(obj);
> > >  	if (ret)
> > > -		goto err_out;
> > > +		goto err_remove_node;
> > >  
> > >  	list_move_tail(&obj->global_list, &dev_priv->mm.bound_list);
> > >  	list_add_tail(&obj->mm_list, &vm->inactive_list);
> > > @@ -3167,9 +3167,11 @@ search_free:
> > >  	i915_gem_verify_gtt(dev);
> > >  	return 0;
> > >  
> > > -err_out:
> > > +err_remove_node:
> > >  	drm_mm_remove_node(&vma->node);
> > > +err_free_vma:
> > >  	i915_gem_vma_destroy(vma);
> > > +err_unpin:
> > >  	i915_gem_object_unpin_pages(obj);
> > >  	return ret;
> > >  }
> > 
> > Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
> 
> Queued for -next, thanks for the review.

So I think this bug here warrants a testcase in igt. I've expected
gem_tiled_blits/interruptible to hit this, but apparently it does not.
So what we want is to fail the drm_mm_insert_node_in_range_generic call
(with -ENOSPC) and then again fail the i915_gem_evict_something call with
an error. That should be simplest to achieve (and indeed does happen as
the bug indicates) when a blocking wait for the gpu to evict a buffer it's
still using.

Hence why I've thought that gem_tiled_blits/interruptible should hit this,
since it thrashes the gtt (so plenty of -ENOSPC) and has the 2nd thread
interrupting us running in parallel (so we should see the occasional
-ERESTARTSYS). But apparently the tuning is wrong for this case. I guess
cranking up the signal rate (we might need a real interface for that
anyway since the current value is the most the wait ioctl test can stomach
before failing). Or using much bigger objects to have more gtt contention
going on. I'd still only use the first fixed part to do blits, but add a
variable part at then end to mix up the gtt drm_mm as much as possible.

Ofc this should be a new subtest so that we don't reduce existing test
coverage.
-Daniel
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index cfa6588..c87a6ec 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3121,8 +3121,8 @@  i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
 
 	vma = i915_gem_vma_create(obj, &dev_priv->gtt.base);
 	if (IS_ERR(vma)) {
-		i915_gem_object_unpin_pages(obj);
-		return PTR_ERR(vma);
+		ret = PTR_ERR(vma);
+		goto err_unpin;
 	}
 
 search_free:
@@ -3138,17 +3138,17 @@  search_free:
 		if (ret == 0)
 			goto search_free;
 
-		goto err_out;
+		goto err_free_vma;
 	}
 	if (WARN_ON(!i915_gem_valid_gtt_space(dev, &vma->node,
 					      obj->cache_level))) {
 		ret = -EINVAL;
-		goto err_out;
+		goto err_remove_node;
 	}
 
 	ret = i915_gem_gtt_prepare_object(obj);
 	if (ret)
-		goto err_out;
+		goto err_remove_node;
 
 	list_move_tail(&obj->global_list, &dev_priv->mm.bound_list);
 	list_add_tail(&obj->mm_list, &vm->inactive_list);
@@ -3167,9 +3167,11 @@  search_free:
 	i915_gem_verify_gtt(dev);
 	return 0;
 
-err_out:
+err_remove_node:
 	drm_mm_remove_node(&vma->node);
+err_free_vma:
 	i915_gem_vma_destroy(vma);
+err_unpin:
 	i915_gem_object_unpin_pages(obj);
 	return ret;
 }