[2/6] drm/i915: Reset ring space estimate after unwinding the request

Message ID 20180307134226.25492-2-chris@chris-wilson.co.uk
State New, archived

Commit Message

Chris Wilson March 7, 2018, 1:42 p.m. UTC
With a series of unusual events (a sequence of interrupted request
allocations), we could gradually leak the ring->space estimate by
unwinding the ring back to the start of the request without returning
the used space to the ring. Eventually, and with great misfortune, it
would be possible to trigger ring->space exhaustion with no requests on
the ring.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_request.c     | 1 +
 drivers/gpu/drm/i915/intel_ringbuffer.c | 1 +
 2 files changed, 2 insertions(+)
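
To make the failure mode concrete, below is a minimal toy model of the
accounting. The names (toy_ring, ring_space, ring_update_space,
ring_emit, RING_SIZE, CACHELINE) are invented for illustration and are
not the real i915 structures; only the arithmetic mirrors
__intel_ring_space() and intel_ring_update_space(). Each interrupted
allocation rewinds emit but leaves the cached estimate debited, so the
estimate drifts away from the true free space by the size of every
unwound request:

/* Toy model, not kernel code: demonstrates the ring->space leak. */
#include <assert.h>
#include <stdio.h>

#define RING_SIZE 4096u
#define CACHELINE 64u

struct toy_ring {
	unsigned int head;   /* GPU consumption point */
	unsigned int emit;   /* CPU emission point */
	unsigned int space;  /* cached free-space estimate */
};

/* Mirrors __intel_ring_space(): free bytes between emit and head,
 * keeping one cacheline of slack. */
static unsigned int ring_space(const struct toy_ring *ring)
{
	return (ring->head - ring->emit - CACHELINE) & (RING_SIZE - 1);
}

static unsigned int ring_update_space(struct toy_ring *ring)
{
	ring->space = ring_space(ring);
	return ring->space;
}

/* Emit @bytes: advance emit and debit the cached estimate. */
static void ring_emit(struct toy_ring *ring, unsigned int bytes)
{
	assert(ring->space >= bytes);
	ring->space -= bytes;
	ring->emit = (ring->emit + bytes) & (RING_SIZE - 1);
}

int main(void)
{
	struct toy_ring ring = { .head = 0, .emit = 0, .space = 0 };
	unsigned int i;

	ring_update_space(&ring);

	for (i = 0; i < 4; i++) {
		unsigned int rq_head = ring.emit; /* start of the request */

		ring_emit(&ring, 256);   /* allocation emits commands... */

		/* ...then is interrupted and unwound: the err_unwind
		 * path resets emit but, without the fix, forgets to
		 * recompute the cached estimate. */
		ring.emit = rq_head;

		printf("estimate %u, actual %u\n",
		       ring.space, ring_space(&ring));
	}
	return 0;
}

Four iterations in, the estimate has dropped to 3008 bytes while the
true free space is still 4032; with enough interruptions the estimate
reaches zero while the ring is empty, which is exactly the exhaustion
the commit message describes.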

Comments

Mika Kuoppala March 9, 2018, 1:17 p.m. UTC | #1
Chris Wilson <chris@chris-wilson.co.uk> writes:

> With a series of unusual events (a sequence of interrupted request
> allocations), we could gradually leak the ring->space estimate by
> unwinding the ring back to the start of the request without returning
> the used space to the ring. Eventually, and with great misfortune, it
> would be possible to trigger ring->space exhaustion with no requests on
> the ring.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/i915_request.c     | 1 +
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 1 +
>  2 files changed, 2 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index d437beac3969..efa9ee557f31 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -798,6 +798,7 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
>  
>  err_unwind:
>  	rq->ring->emit = rq->head;
> +	intel_ring_update_space(rq->ring);
>  
>  	/* Make sure we didn't add ourselves to external state before freeing */
>  	GEM_BUG_ON(!list_empty(&rq->active_list));
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 1d599524a759..88eeb64041ae 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -1593,6 +1593,7 @@ static noinline int wait_for_space(struct intel_ring *ring, unsigned int bytes)
>  	if (intel_ring_update_space(ring) >= bytes)
>  		return 0;
>  
> +	GEM_BUG_ON(list_empty(&ring->request_list));

At some point, after a long series of misfortunate events, we
will add a first request, actually need to wait for space for it,
and then kaboom in here?

-Mika

>  	list_for_each_entry(target, &ring->request_list, ring_link) {
>  		/* Would completion of this request free enough space? */
>  		if (bytes <= __intel_ring_space(target->postfix,
> -- 
> 2.16.2
Chris Wilson March 9, 2018, 1:20 p.m. UTC | #2
Quoting Chris Wilson (2018-03-07 13:42:22)
> With a series of unusual events (a sequence of interrupted request
> allocations), we could gradually leak the ring->space estimate by
> unwinding the ring back to the start of the request without returning
> the used space to the ring. Eventually, and with great misfortune, it
> would be possible to trigger ring->space exhaustion with no requests on
> the ring.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/i915_request.c     | 1 +
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 1 +
>  2 files changed, 2 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index d437beac3969..efa9ee557f31 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -798,6 +798,7 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
>  
>  err_unwind:
>         rq->ring->emit = rq->head;
> +       intel_ring_update_space(rq->ring);

Ok, skip this one, as we will correct ourselves the next time we
wait_for_space(). It's just the next patch, where we weren't maintaining
ring->tail, that was the issue.
-Chris
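
In terms of the toy model sketched above (same invented names, plus a
hypothetical toy_wait_for_space()), this self-correction is because a
wait_for_space()-style path recomputes the estimate before doing
anything else, so the staleness left by an unwound request heals on the
next allocation:

/* Continuation of the toy model above (invented names, not kernel
 * code): the recompute at the top repairs a stale estimate before we
 * ever consider waiting. */
#include <errno.h>

static int toy_wait_for_space(struct toy_ring *ring, unsigned int bytes)
{
	/* Resynchronise the cached estimate with reality first; this is
	 * what makes the err_unwind hunk safe to drop. */
	if (ring_update_space(ring) >= bytes)
		return 0;

	/* Otherwise some request must still occupy the ring and we
	 * would wait for it to retire (elided here). */
	return -ENOSPC;
}
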
Chris Wilson March 9, 2018, 1:42 p.m. UTC | #3
Quoting Mika Kuoppala (2018-03-09 13:17:09)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> 
> > With a series of unusual events (a sequence of interrupted request
> > allocations), we could gradually leak the ring->space estimate by
> > unwinding the ring back to the start of the request without returning
> > the used space to the ring. Eventually, and with great misfortune, it
> > would be possible to trigger ring->space exhaustion with no requests on
> > the ring.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_request.c     | 1 +
> >  drivers/gpu/drm/i915/intel_ringbuffer.c | 1 +
> >  2 files changed, 2 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> > index d437beac3969..efa9ee557f31 100644
> > --- a/drivers/gpu/drm/i915/i915_request.c
> > +++ b/drivers/gpu/drm/i915/i915_request.c
> > @@ -798,6 +798,7 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
> >  
> >  err_unwind:
> >       rq->ring->emit = rq->head;
> > +     intel_ring_update_space(rq->ring);
> >  
> >       /* Make sure we didn't add ourselves to external state before freeing */
> >       GEM_BUG_ON(!list_empty(&rq->active_list));
> > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > index 1d599524a759..88eeb64041ae 100644
> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > @@ -1593,6 +1593,7 @@ static noinline int wait_for_space(struct intel_ring *ring, unsigned int bytes)
> >       if (intel_ring_update_space(ring) >= bytes)
> >               return 0;
> >  
> > +     GEM_BUG_ON(list_empty(&ring->request_list));
> 
> At some point, after a long series of misfortunate events, we
> will add a first request, actually need to wait for space for it,
> and then kaboom in here?

That's the experience. However, I don't think this patch helps, because
we do the intel_ring_update_space() here inside wait_for_space() anyway,
so it should come out in the wash so long as we aren't corrupting the
ring space calculation.
-Chris
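
A sketch of why the new assertion is then safe (toy arithmetic again,
not kernel code; it assumes, as modeled above, that retiring every
request leaves head caught up with emit): with no requests on the ring,
the recomputed space is maximal, so the early return in
wait_for_space() fires before the GEM_BUG_ON is ever reached:

/* With an empty ring, head has caught up with emit, so the recomputed
 * space is the whole ring minus one cacheline of slack -- enough for
 * any legal request, hence the early return triggers first. */
#include <assert.h>

#define RING_SIZE 4096u
#define CACHELINE 64u

static unsigned int ring_space(unsigned int head, unsigned int emit)
{
	return (head - emit - CACHELINE) & (RING_SIZE - 1);
}

int main(void)
{
	unsigned int head = 1024, emit = 1024; /* empty ring */

	assert(ring_space(head, emit) == RING_SIZE - CACHELINE);
	return 0;
}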

Patch

diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index d437beac3969..efa9ee557f31 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -798,6 +798,7 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 
 err_unwind:
 	rq->ring->emit = rq->head;
+	intel_ring_update_space(rq->ring);
 
 	/* Make sure we didn't add ourselves to external state before freeing */
 	GEM_BUG_ON(!list_empty(&rq->active_list));
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 1d599524a759..88eeb64041ae 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1593,6 +1593,7 @@ static noinline int wait_for_space(struct intel_ring *ring, unsigned int bytes)
 	if (intel_ring_update_space(ring) >= bytes)
 		return 0;
 
+	GEM_BUG_ON(list_empty(&ring->request_list));
 	list_for_each_entry(target, &ring->request_list, ring_link) {
 		/* Would completion of this request free enough space? */
 		if (bytes <= __intel_ring_space(target->postfix,