[03/55] drm/i915: i915_add_request must not fail

Message ID	1432917856-12261-4-git-send-email-John.C.Harrison@Intel.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <intel-gfx-bounces@lists.freedesktop.org> From: John.C.Harrison@Intel.com To: Intel-GFX@Lists.FreeDesktop.Org Date: Fri, 29 May 2015 17:43:24 +0100 Message-Id: <1432917856-12261-4-git-send-email-John.C.Harrison@Intel.com> In-Reply-To: <1432917856-12261-1-git-send-email-John.C.Harrison@Intel.com> References: <1432917856-12261-1-git-send-email-John.C.Harrison@Intel.com> Organization: Intel Corporation (UK) Ltd. - Co. Reg. #1134945 - Pipers Way, Swindon SN3 1RJ Subject: [Intel-gfx] [PATCH 03/55] drm/i915: i915_add_request must not fail Precedence: list MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" <intel-gfx-bounces@lists.freedesktop.org>

John Harrison May 29, 2015, 4:43 p.m. UTC

From: John Harrison <John.C.Harrison@Intel.com>

The i915_add_request() function is called to keep track of work that has been
written to the ring buffer. It adds epilogue commands to track progress (seqno
updates and such), moves the request structure onto the right list and other
such house keeping tasks. However, the work itself has already been written to
the ring and will get executed whether or not the add request call succeeds. So
no matter what goes wrong, there isn't a whole lot of point in failing the call.

At the moment, this is fine(ish). If the add request does bail early on and not
do the housekeeping, the request will still float around in the
ring->outstanding_lazy_request field and be picked up next time. It means
multiple pieces of work will be tagged as the same request and driver can't
actually wait for the first piece of work until something else has been
submitted. But it all sort of hangs together.

This patch series is all about removing the OLR and guaranteeing that each piece
of work gets its own personal request. That means that there is no more
'hoovering up of forgotten requests'. If the request does not get tracked then
it will be leaked. Thus the add request call _must_ not fail. The previous patch
should have already ensured that it _will_ not fail by removing the potential
for running out of ring space. This patch enforces the rule by actually removing
the early exit paths and the return code.

Note that if something does manage to fail and the epilogue commands don't get
written to the ring, the driver will still hang together. The request will be
added to the tracking lists. And as in the old case, any subsequent work will
generate a new seqno which will suffice for marking the old one as complete.

v2: Improved WARNings (Tomas Elf review request).

For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h              |    6 ++--
 drivers/gpu/drm/i915/i915_gem.c              |   43 ++++++++++++--------------
 drivers/gpu/drm/i915/i915_gem_execbuffer.c   |    2 +-
 drivers/gpu/drm/i915/i915_gem_render_state.c |    2 +-
 drivers/gpu/drm/i915/intel_lrc.c             |    2 +-
 drivers/gpu/drm/i915/intel_overlay.c         |    8 ++---
 drivers/gpu/drm/i915/intel_ringbuffer.c      |    8 ++---
 7 files changed, 31 insertions(+), 40 deletions(-)

Tomas Elf June 2, 2015, 6:16 p.m. UTC | #1

On 29/05/2015 17:43, John.C.Harrison@Intel.com wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
>
> The i915_add_request() function is called to keep track of work that has been
> written to the ring buffer. It adds epilogue commands to track progress (seqno
> updates and such), moves the request structure onto the right list and other
> such house keeping tasks. However, the work itself has already been written to
> the ring and will get executed whether or not the add request call succeeds. So
> no matter what goes wrong, there isn't a whole lot of point in failing the call.
>
> At the moment, this is fine(ish). If the add request does bail early on and not
> do the housekeeping, the request will still float around in the
> ring->outstanding_lazy_request field and be picked up next time. It means
> multiple pieces of work will be tagged as the same request and driver can't
> actually wait for the first piece of work until something else has been
> submitted. But it all sort of hangs together.
>
> This patch series is all about removing the OLR and guaranteeing that each piece
> of work gets its own personal request. That means that there is no more
> 'hoovering up of forgotten requests'. If the request does not get tracked then
> it will be leaked. Thus the add request call _must_ not fail. The previous patch
> should have already ensured that it _will_ not fail by removing the potential
> for running out of ring space. This patch enforces the rule by actually removing
> the early exit paths and the return code.
>
> Note that if something does manage to fail and the epilogue commands don't get
> written to the ring, the driver will still hang together. The request will be
> added to the tracking lists. And as in the old case, any subsequent work will
> generate a new seqno which will suffice for marking the old one as complete.
>
> v2: Improved WARNings (Tomas Elf review request).
>
> For: VIZ-5115
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> ---
>   drivers/gpu/drm/i915/i915_drv.h              |    6 ++--
>   drivers/gpu/drm/i915/i915_gem.c              |   43 ++++++++++++--------------
>   drivers/gpu/drm/i915/i915_gem_execbuffer.c   |    2 +-
>   drivers/gpu/drm/i915/i915_gem_render_state.c |    2 +-
>   drivers/gpu/drm/i915/intel_lrc.c             |    2 +-
>   drivers/gpu/drm/i915/intel_overlay.c         |    8 ++---
>   drivers/gpu/drm/i915/intel_ringbuffer.c      |    8 ++---
>   7 files changed, 31 insertions(+), 40 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index eba1857..1be4a52 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2860,9 +2860,9 @@ void i915_gem_init_swizzling(struct drm_device *dev);
>   void i915_gem_cleanup_ringbuffer(struct drm_device *dev);
>   int __must_check i915_gpu_idle(struct drm_device *dev);
>   int __must_check i915_gem_suspend(struct drm_device *dev);
> -int __i915_add_request(struct intel_engine_cs *ring,
> -		       struct drm_file *file,
> -		       struct drm_i915_gem_object *batch_obj);
> +void __i915_add_request(struct intel_engine_cs *ring,
> +			struct drm_file *file,
> +			struct drm_i915_gem_object *batch_obj);
>   #define i915_add_request(ring) \
>   	__i915_add_request(ring, NULL, NULL)
>   int __i915_wait_request(struct drm_i915_gem_request *req,
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 6f51416..dd39aa5 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1155,15 +1155,12 @@ i915_gem_check_wedge(struct i915_gpu_error *error,
>   int
>   i915_gem_check_olr(struct drm_i915_gem_request *req)
>   {
> -	int ret;
> -
>   	WARN_ON(!mutex_is_locked(&req->ring->dev->struct_mutex));
>
> -	ret = 0;
>   	if (req == req->ring->outstanding_lazy_request)
> -		ret = i915_add_request(req->ring);
> +		i915_add_request(req->ring);
>
> -	return ret;
> +	return 0;
>   }

i915_gem_check_olr never returns anything but 0. How about making it void?

Thanks,
Tomas

>
>   static void fake_irq(unsigned long data)
> @@ -2466,9 +2463,14 @@ i915_gem_get_seqno(struct drm_device *dev, u32 *seqno)
>   	return 0;
>   }
>
> -int __i915_add_request(struct intel_engine_cs *ring,
> -		       struct drm_file *file,
> -		       struct drm_i915_gem_object *obj)
> +/*
> + * NB: This function is not allowed to fail. Doing so would mean the the
> + * request is not being tracked for completion but the work itself is
> + * going to happen on the hardware. This would be a Bad Thing(tm).
> + */
> +void __i915_add_request(struct intel_engine_cs *ring,
> +			struct drm_file *file,
> +			struct drm_i915_gem_object *obj)
>   {
>   	struct drm_i915_private *dev_priv = ring->dev->dev_private;
>   	struct drm_i915_gem_request *request;
> @@ -2478,7 +2480,7 @@ int __i915_add_request(struct intel_engine_cs *ring,
>
>   	request = ring->outstanding_lazy_request;
>   	if (WARN_ON(request == NULL))
> -		return -ENOMEM;
> +		return;

You have a WARN for the other points of failure in this function, why 
not here?

>
>   	if (i915.enable_execlists) {
>   		ringbuf = request->ctx->engine[ring->id].ringbuf;
> @@ -2500,15 +2502,12 @@ int __i915_add_request(struct intel_engine_cs *ring,
>   	 * is that the flush _must_ happen before the next request, no matter
>   	 * what.
>   	 */
> -	if (i915.enable_execlists) {
> +	if (i915.enable_execlists)
>   		ret = logical_ring_flush_all_caches(ringbuf, request->ctx);
> -		if (ret)
> -			return ret;
> -	} else {
> +	else
>   		ret = intel_ring_flush_all_caches(ring);
> -		if (ret)
> -			return ret;
> -	}
> +	/* Not allowed to fail! */
> +	WARN(ret, "*_ring_flush_all_caches failed: %d!\n", ret);
>
>   	/* Record the position of the start of the request so that
>   	 * should we detect the updated seqno part-way through the
> @@ -2517,17 +2516,15 @@ int __i915_add_request(struct intel_engine_cs *ring,
>   	 */
>   	request->postfix = intel_ring_get_tail(ringbuf);
>
> -	if (i915.enable_execlists) {
> +	if (i915.enable_execlists)
>   		ret = ring->emit_request(ringbuf, request);
> -		if (ret)
> -			return ret;
> -	} else {
> +	else {
>   		ret = ring->add_request(ring);
> -		if (ret)
> -			return ret;
>
>   		request->tail = intel_ring_get_tail(ringbuf);
>   	}
> +	/* Not allowed to fail! */
> +	WARN(ret, "emit|add_request failed: %d!\n", ret);
>
>   	request->head = request_start;
>
> @@ -2576,8 +2573,6 @@ int __i915_add_request(struct intel_engine_cs *ring,
>
>   	/* Sanity check that the reserved size was large enough. */
>   	intel_ring_reserved_space_end(ringbuf);
> -
> -	return 0;
>   }
>
>   static bool i915_context_is_banned(struct drm_i915_private *dev_priv,
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index bd0e4bd..2b48a31 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -1061,7 +1061,7 @@ i915_gem_execbuffer_retire_commands(struct drm_device *dev,
>   	ring->gpu_caches_dirty = true;
>
>   	/* Add a breadcrumb for the completion of the batch buffer */
> -	(void)__i915_add_request(ring, file, obj);
> +	__i915_add_request(ring, file, obj);
>   }
>
>   static int
> diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c b/drivers/gpu/drm/i915/i915_gem_render_state.c
> index 521548a..ce4788f 100644
> --- a/drivers/gpu/drm/i915/i915_gem_render_state.c
> +++ b/drivers/gpu/drm/i915/i915_gem_render_state.c
> @@ -173,7 +173,7 @@ int i915_gem_render_state_init(struct intel_engine_cs *ring)
>
>   	i915_vma_move_to_active(i915_gem_obj_to_ggtt(so.obj), ring);
>
> -	ret = __i915_add_request(ring, NULL, so.obj);
> +	__i915_add_request(ring, NULL, so.obj);
>   	/* __i915_add_request moves object to inactive if it fails */
>   out:
>   	i915_gem_render_state_fini(&so);
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index e62d396..7a75fc8 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1373,7 +1373,7 @@ static int intel_lr_context_render_state_init(struct intel_engine_cs *ring,
>
>   	i915_vma_move_to_active(i915_gem_obj_to_ggtt(so.obj), ring);
>
> -	ret = __i915_add_request(ring, file, so.obj);
> +	__i915_add_request(ring, file, so.obj);
>   	/* intel_logical_ring_add_request moves object to inactive if it
>   	 * fails */
>   out:
> diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
> index 25c8ec6..e7534b9 100644
> --- a/drivers/gpu/drm/i915/intel_overlay.c
> +++ b/drivers/gpu/drm/i915/intel_overlay.c
> @@ -220,9 +220,7 @@ static int intel_overlay_do_wait_request(struct intel_overlay *overlay,
>   	WARN_ON(overlay->last_flip_req);
>   	i915_gem_request_assign(&overlay->last_flip_req,
>   					     ring->outstanding_lazy_request);
> -	ret = i915_add_request(ring);
> -	if (ret)
> -		return ret;
> +	i915_add_request(ring);
>
>   	overlay->flip_tail = tail;
>   	ret = i915_wait_request(overlay->last_flip_req);
> @@ -291,7 +289,9 @@ static int intel_overlay_continue(struct intel_overlay *overlay,
>   	WARN_ON(overlay->last_flip_req);
>   	i915_gem_request_assign(&overlay->last_flip_req,
>   					     ring->outstanding_lazy_request);
> -	return i915_add_request(ring);
> +	i915_add_request(ring);
> +
> +	return 0;
>   }
>
>   static void intel_overlay_release_old_vid_tail(struct intel_overlay *overlay)
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 74c2222..7061b07 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -2156,14 +2156,10 @@ static int intel_wrap_ring_buffer(struct intel_engine_cs *ring)
>   int intel_ring_idle(struct intel_engine_cs *ring)
>   {
>   	struct drm_i915_gem_request *req;
> -	int ret;
>
>   	/* We need to add any requests required to flush the objects and ring */
> -	if (ring->outstanding_lazy_request) {
> -		ret = i915_add_request(ring);
> -		if (ret)
> -			return ret;
> -	}
> +	if (ring->outstanding_lazy_request)
> +		i915_add_request(ring);
>
>   	/* Wait upon the last request to be completed */
>   	if (list_empty(&ring->request_list))
>

John Harrison June 4, 2015, 2:07 p.m. UTC | #2

On 02/06/2015 19:16, Tomas Elf wrote:
> On 29/05/2015 17:43, John.C.Harrison@Intel.com wrote:
>> From: John Harrison <John.C.Harrison@Intel.com>
>>
>> The i915_add_request() function is called to keep track of work that 
>> has been
>> written to the ring buffer. It adds epilogue commands to track 
>> progress (seqno
>> updates and such), moves the request structure onto the right list 
>> and other
>> such house keeping tasks. However, the work itself has already been 
>> written to
>> the ring and will get executed whether or not the add request call 
>> succeeds. So
>> no matter what goes wrong, there isn't a whole lot of point in 
>> failing the call.
>>
>> At the moment, this is fine(ish). If the add request does bail early 
>> on and not
>> do the housekeeping, the request will still float around in the
>> ring->outstanding_lazy_request field and be picked up next time. It 
>> means
>> multiple pieces of work will be tagged as the same request and driver 
>> can't
>> actually wait for the first piece of work until something else has been
>> submitted. But it all sort of hangs together.
>>
>> This patch series is all about removing the OLR and guaranteeing that 
>> each piece
>> of work gets its own personal request. That means that there is no more
>> 'hoovering up of forgotten requests'. If the request does not get 
>> tracked then
>> it will be leaked. Thus the add request call _must_ not fail. The 
>> previous patch
>> should have already ensured that it _will_ not fail by removing the 
>> potential
>> for running out of ring space. This patch enforces the rule by 
>> actually removing
>> the early exit paths and the return code.
>>
>> Note that if something does manage to fail and the epilogue commands 
>> don't get
>> written to the ring, the driver will still hang together. The request 
>> will be
>> added to the tracking lists. And as in the old case, any subsequent 
>> work will
>> generate a new seqno which will suffice for marking the old one as 
>> complete.
>>
>> v2: Improved WARNings (Tomas Elf review request).
>>
>> For: VIZ-5115
>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_drv.h              |    6 ++--
>>   drivers/gpu/drm/i915/i915_gem.c              |   43 
>> ++++++++++++--------------
>>   drivers/gpu/drm/i915/i915_gem_execbuffer.c   |    2 +-
>>   drivers/gpu/drm/i915/i915_gem_render_state.c |    2 +-
>>   drivers/gpu/drm/i915/intel_lrc.c             |    2 +-
>>   drivers/gpu/drm/i915/intel_overlay.c         |    8 ++---
>>   drivers/gpu/drm/i915/intel_ringbuffer.c      |    8 ++---
>>   7 files changed, 31 insertions(+), 40 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h 
>> b/drivers/gpu/drm/i915/i915_drv.h
>> index eba1857..1be4a52 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -2860,9 +2860,9 @@ void i915_gem_init_swizzling(struct drm_device 
>> *dev);
>>   void i915_gem_cleanup_ringbuffer(struct drm_device *dev);
>>   int __must_check i915_gpu_idle(struct drm_device *dev);
>>   int __must_check i915_gem_suspend(struct drm_device *dev);
>> -int __i915_add_request(struct intel_engine_cs *ring,
>> -               struct drm_file *file,
>> -               struct drm_i915_gem_object *batch_obj);
>> +void __i915_add_request(struct intel_engine_cs *ring,
>> +            struct drm_file *file,
>> +            struct drm_i915_gem_object *batch_obj);
>>   #define i915_add_request(ring) \
>>       __i915_add_request(ring, NULL, NULL)
>>   int __i915_wait_request(struct drm_i915_gem_request *req,
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c 
>> b/drivers/gpu/drm/i915/i915_gem.c
>> index 6f51416..dd39aa5 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -1155,15 +1155,12 @@ i915_gem_check_wedge(struct i915_gpu_error 
>> *error,
>>   int
>>   i915_gem_check_olr(struct drm_i915_gem_request *req)
>>   {
>> -    int ret;
>> -
>> WARN_ON(!mutex_is_locked(&req->ring->dev->struct_mutex));
>>
>> -    ret = 0;
>>       if (req == req->ring->outstanding_lazy_request)
>> -        ret = i915_add_request(req->ring);
>> +        i915_add_request(req->ring);
>>
>> -    return ret;
>> +    return 0;
>>   }
>
> i915_gem_check_olr never returns anything but 0. How about making it 
> void?
Seems like redundant/unnecessary changes given that the entire function 
is removed later in the series. Changing it from a '__must_check' to a 
'void' now would be a lot of extra changes.

> Thanks,
> Tomas
>
>>
>>   static void fake_irq(unsigned long data)
>> @@ -2466,9 +2463,14 @@ i915_gem_get_seqno(struct drm_device *dev, u32 
>> *seqno)
>>       return 0;
>>   }
>>
>> -int __i915_add_request(struct intel_engine_cs *ring,
>> -               struct drm_file *file,
>> -               struct drm_i915_gem_object *obj)
>> +/*
>> + * NB: This function is not allowed to fail. Doing so would mean the 
>> the
>> + * request is not being tracked for completion but the work itself is
>> + * going to happen on the hardware. This would be a Bad Thing(tm).
>> + */
>> +void __i915_add_request(struct intel_engine_cs *ring,
>> +            struct drm_file *file,
>> +            struct drm_i915_gem_object *obj)
>>   {
>>       struct drm_i915_private *dev_priv = ring->dev->dev_private;
>>       struct drm_i915_gem_request *request;
>> @@ -2478,7 +2480,7 @@ int __i915_add_request(struct intel_engine_cs 
>> *ring,
>>
>>       request = ring->outstanding_lazy_request;
>>       if (WARN_ON(request == NULL))
>> -        return -ENOMEM;
>> +        return;
>
> You have a WARN for the other points of failure in this function, why 
> not here?
Erm, you mean like the one in the 'if' itself?

>
>>
>>       if (i915.enable_execlists) {
>>           ringbuf = request->ctx->engine[ring->id].ringbuf;
>> @@ -2500,15 +2502,12 @@ int __i915_add_request(struct intel_engine_cs 
>> *ring,
>>        * is that the flush _must_ happen before the next request, no 
>> matter
>>        * what.
>>        */
>> -    if (i915.enable_execlists) {
>> +    if (i915.enable_execlists)
>>           ret = logical_ring_flush_all_caches(ringbuf, request->ctx);
>> -        if (ret)
>> -            return ret;
>> -    } else {
>> +    else
>>           ret = intel_ring_flush_all_caches(ring);
>> -        if (ret)
>> -            return ret;
>> -    }
>> +    /* Not allowed to fail! */
>> +    WARN(ret, "*_ring_flush_all_caches failed: %d!\n", ret);
>>
>>       /* Record the position of the start of the request so that
>>        * should we detect the updated seqno part-way through the
>> @@ -2517,17 +2516,15 @@ int __i915_add_request(struct intel_engine_cs 
>> *ring,
>>        */
>>       request->postfix = intel_ring_get_tail(ringbuf);
>>
>> -    if (i915.enable_execlists) {
>> +    if (i915.enable_execlists)
>>           ret = ring->emit_request(ringbuf, request);
>> -        if (ret)
>> -            return ret;
>> -    } else {
>> +    else {
>>           ret = ring->add_request(ring);
>> -        if (ret)
>> -            return ret;
>>
>>           request->tail = intel_ring_get_tail(ringbuf);
>>       }
>> +    /* Not allowed to fail! */
>> +    WARN(ret, "emit|add_request failed: %d!\n", ret);
>>
>>       request->head = request_start;
>>
>> @@ -2576,8 +2573,6 @@ int __i915_add_request(struct intel_engine_cs 
>> *ring,
>>
>>       /* Sanity check that the reserved size was large enough. */
>>       intel_ring_reserved_space_end(ringbuf);
>> -
>> -    return 0;
>>   }
>>
>>   static bool i915_context_is_banned(struct drm_i915_private *dev_priv,
>> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c 
>> b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> index bd0e4bd..2b48a31 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> @@ -1061,7 +1061,7 @@ i915_gem_execbuffer_retire_commands(struct 
>> drm_device *dev,
>>       ring->gpu_caches_dirty = true;
>>
>>       /* Add a breadcrumb for the completion of the batch buffer */
>> -    (void)__i915_add_request(ring, file, obj);
>> +    __i915_add_request(ring, file, obj);
>>   }
>>
>>   static int
>> diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c 
>> b/drivers/gpu/drm/i915/i915_gem_render_state.c
>> index 521548a..ce4788f 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_render_state.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_render_state.c
>> @@ -173,7 +173,7 @@ int i915_gem_render_state_init(struct 
>> intel_engine_cs *ring)
>>
>>       i915_vma_move_to_active(i915_gem_obj_to_ggtt(so.obj), ring);
>>
>> -    ret = __i915_add_request(ring, NULL, so.obj);
>> +    __i915_add_request(ring, NULL, so.obj);
>>       /* __i915_add_request moves object to inactive if it fails */
>>   out:
>>       i915_gem_render_state_fini(&so);
>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c 
>> b/drivers/gpu/drm/i915/intel_lrc.c
>> index e62d396..7a75fc8 100644
>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>> @@ -1373,7 +1373,7 @@ static int 
>> intel_lr_context_render_state_init(struct intel_engine_cs *ring,
>>
>>       i915_vma_move_to_active(i915_gem_obj_to_ggtt(so.obj), ring);
>>
>> -    ret = __i915_add_request(ring, file, so.obj);
>> +    __i915_add_request(ring, file, so.obj);
>>       /* intel_logical_ring_add_request moves object to inactive if it
>>        * fails */
>>   out:
>> diff --git a/drivers/gpu/drm/i915/intel_overlay.c 
>> b/drivers/gpu/drm/i915/intel_overlay.c
>> index 25c8ec6..e7534b9 100644
>> --- a/drivers/gpu/drm/i915/intel_overlay.c
>> +++ b/drivers/gpu/drm/i915/intel_overlay.c
>> @@ -220,9 +220,7 @@ static int intel_overlay_do_wait_request(struct 
>> intel_overlay *overlay,
>>       WARN_ON(overlay->last_flip_req);
>>       i915_gem_request_assign(&overlay->last_flip_req,
>>                            ring->outstanding_lazy_request);
>> -    ret = i915_add_request(ring);
>> -    if (ret)
>> -        return ret;
>> +    i915_add_request(ring);
>>
>>       overlay->flip_tail = tail;
>>       ret = i915_wait_request(overlay->last_flip_req);
>> @@ -291,7 +289,9 @@ static int intel_overlay_continue(struct 
>> intel_overlay *overlay,
>>       WARN_ON(overlay->last_flip_req);
>>       i915_gem_request_assign(&overlay->last_flip_req,
>>                            ring->outstanding_lazy_request);
>> -    return i915_add_request(ring);
>> +    i915_add_request(ring);
>> +
>> +    return 0;
>>   }
>>
>>   static void intel_overlay_release_old_vid_tail(struct intel_overlay 
>> *overlay)
>> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c 
>> b/drivers/gpu/drm/i915/intel_ringbuffer.c
>> index 74c2222..7061b07 100644
>> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
>> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
>> @@ -2156,14 +2156,10 @@ static int intel_wrap_ring_buffer(struct 
>> intel_engine_cs *ring)
>>   int intel_ring_idle(struct intel_engine_cs *ring)
>>   {
>>       struct drm_i915_gem_request *req;
>> -    int ret;
>>
>>       /* We need to add any requests required to flush the objects 
>> and ring */
>> -    if (ring->outstanding_lazy_request) {
>> -        ret = i915_add_request(ring);
>> -        if (ret)
>> -            return ret;
>> -    }
>> +    if (ring->outstanding_lazy_request)
>> +        i915_add_request(ring);
>>
>>       /* Wait upon the last request to be completed */
>>       if (list_empty(&ring->request_list))
>>
>

Tomas Elf June 5, 2015, 10:55 a.m. UTC | #3

On 04/06/2015 15:07, John Harrison wrote:
> On 02/06/2015 19:16, Tomas Elf wrote:
>> On 29/05/2015 17:43, John.C.Harrison@Intel.com wrote:
>>> From: John Harrison <John.C.Harrison@Intel.com>
>>>
>>> The i915_add_request() function is called to keep track of work that
>>> has been
>>> written to the ring buffer. It adds epilogue commands to track
>>> progress (seqno
>>> updates and such), moves the request structure onto the right list
>>> and other
>>> such house keeping tasks. However, the work itself has already been
>>> written to
>>> the ring and will get executed whether or not the add request call
>>> succeeds. So
>>> no matter what goes wrong, there isn't a whole lot of point in
>>> failing the call.
>>>
>>> At the moment, this is fine(ish). If the add request does bail early
>>> on and not
>>> do the housekeeping, the request will still float around in the
>>> ring->outstanding_lazy_request field and be picked up next time. It
>>> means
>>> multiple pieces of work will be tagged as the same request and driver
>>> can't
>>> actually wait for the first piece of work until something else has been
>>> submitted. But it all sort of hangs together.
>>>
>>> This patch series is all about removing the OLR and guaranteeing that
>>> each piece
>>> of work gets its own personal request. That means that there is no more
>>> 'hoovering up of forgotten requests'. If the request does not get
>>> tracked then
>>> it will be leaked. Thus the add request call _must_ not fail. The
>>> previous patch
>>> should have already ensured that it _will_ not fail by removing the
>>> potential
>>> for running out of ring space. This patch enforces the rule by
>>> actually removing
>>> the early exit paths and the return code.
>>>
>>> Note that if something does manage to fail and the epilogue commands
>>> don't get
>>> written to the ring, the driver will still hang together. The request
>>> will be
>>> added to the tracking lists. And as in the old case, any subsequent
>>> work will
>>> generate a new seqno which will suffice for marking the old one as
>>> complete.
>>>
>>> v2: Improved WARNings (Tomas Elf review request).
>>>
>>> For: VIZ-5115
>>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
>>> ---
>>>   drivers/gpu/drm/i915/i915_drv.h              |    6 ++--
>>>   drivers/gpu/drm/i915/i915_gem.c              |   43
>>> ++++++++++++--------------
>>>   drivers/gpu/drm/i915/i915_gem_execbuffer.c   |    2 +-
>>>   drivers/gpu/drm/i915/i915_gem_render_state.c |    2 +-
>>>   drivers/gpu/drm/i915/intel_lrc.c             |    2 +-
>>>   drivers/gpu/drm/i915/intel_overlay.c         |    8 ++---
>>>   drivers/gpu/drm/i915/intel_ringbuffer.c      |    8 ++---
>>>   7 files changed, 31 insertions(+), 40 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h
>>> b/drivers/gpu/drm/i915/i915_drv.h
>>> index eba1857..1be4a52 100644
>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>> @@ -2860,9 +2860,9 @@ void i915_gem_init_swizzling(struct drm_device
>>> *dev);
>>>   void i915_gem_cleanup_ringbuffer(struct drm_device *dev);
>>>   int __must_check i915_gpu_idle(struct drm_device *dev);
>>>   int __must_check i915_gem_suspend(struct drm_device *dev);
>>> -int __i915_add_request(struct intel_engine_cs *ring,
>>> -               struct drm_file *file,
>>> -               struct drm_i915_gem_object *batch_obj);
>>> +void __i915_add_request(struct intel_engine_cs *ring,
>>> +            struct drm_file *file,
>>> +            struct drm_i915_gem_object *batch_obj);
>>>   #define i915_add_request(ring) \
>>>       __i915_add_request(ring, NULL, NULL)
>>>   int __i915_wait_request(struct drm_i915_gem_request *req,
>>> diff --git a/drivers/gpu/drm/i915/i915_gem.c
>>> b/drivers/gpu/drm/i915/i915_gem.c
>>> index 6f51416..dd39aa5 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem.c
>>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>>> @@ -1155,15 +1155,12 @@ i915_gem_check_wedge(struct i915_gpu_error
>>> *error,
>>>   int
>>>   i915_gem_check_olr(struct drm_i915_gem_request *req)
>>>   {
>>> -    int ret;
>>> -
>>> WARN_ON(!mutex_is_locked(&req->ring->dev->struct_mutex));
>>>
>>> -    ret = 0;
>>>       if (req == req->ring->outstanding_lazy_request)
>>> -        ret = i915_add_request(req->ring);
>>> +        i915_add_request(req->ring);
>>>
>>> -    return ret;
>>> +    return 0;
>>>   }
>>
>> i915_gem_check_olr never returns anything but 0. How about making it
>> void?
> Seems like redundant/unnecessary changes given that the entire function
> is removed later in the series. Changing it from a '__must_check' to a
> 'void' now would be a lot of extra changes.
>
>> Thanks,
>> Tomas
>>
>>>
>>>   static void fake_irq(unsigned long data)
>>> @@ -2466,9 +2463,14 @@ i915_gem_get_seqno(struct drm_device *dev, u32
>>> *seqno)
>>>       return 0;
>>>   }
>>>
>>> -int __i915_add_request(struct intel_engine_cs *ring,
>>> -               struct drm_file *file,
>>> -               struct drm_i915_gem_object *obj)
>>> +/*
>>> + * NB: This function is not allowed to fail. Doing so would mean the
>>> the
>>> + * request is not being tracked for completion but the work itself is
>>> + * going to happen on the hardware. This would be a Bad Thing(tm).
>>> + */
>>> +void __i915_add_request(struct intel_engine_cs *ring,
>>> +            struct drm_file *file,
>>> +            struct drm_i915_gem_object *obj)
>>>   {
>>>       struct drm_i915_private *dev_priv = ring->dev->dev_private;
>>>       struct drm_i915_gem_request *request;
>>> @@ -2478,7 +2480,7 @@ int __i915_add_request(struct intel_engine_cs
>>> *ring,
>>>
>>>       request = ring->outstanding_lazy_request;
>>>       if (WARN_ON(request == NULL))
>>> -        return -ENOMEM;
>>> +        return;
>>
>> You have a WARN for the other points of failure in this function, why
>> not here?
> Erm, you mean like the one in the 'if' itself?
>

Exactly! :)

Reviewed-by: Tomas Elf <tomas.elf@intel.com>

Thanks,
Tomas

>>
>>>
>>>       if (i915.enable_execlists) {
>>>           ringbuf = request->ctx->engine[ring->id].ringbuf;
>>> @@ -2500,15 +2502,12 @@ int __i915_add_request(struct intel_engine_cs
>>> *ring,
>>>        * is that the flush _must_ happen before the next request, no
>>> matter
>>>        * what.
>>>        */
>>> -    if (i915.enable_execlists) {
>>> +    if (i915.enable_execlists)
>>>           ret = logical_ring_flush_all_caches(ringbuf, request->ctx);
>>> -        if (ret)
>>> -            return ret;
>>> -    } else {
>>> +    else
>>>           ret = intel_ring_flush_all_caches(ring);
>>> -        if (ret)
>>> -            return ret;
>>> -    }
>>> +    /* Not allowed to fail! */
>>> +    WARN(ret, "*_ring_flush_all_caches failed: %d!\n", ret);
>>>
>>>       /* Record the position of the start of the request so that
>>>        * should we detect the updated seqno part-way through the
>>> @@ -2517,17 +2516,15 @@ int __i915_add_request(struct intel_engine_cs
>>> *ring,
>>>        */
>>>       request->postfix = intel_ring_get_tail(ringbuf);
>>>
>>> -    if (i915.enable_execlists) {
>>> +    if (i915.enable_execlists)
>>>           ret = ring->emit_request(ringbuf, request);
>>> -        if (ret)
>>> -            return ret;
>>> -    } else {
>>> +    else {
>>>           ret = ring->add_request(ring);
>>> -        if (ret)
>>> -            return ret;
>>>
>>>           request->tail = intel_ring_get_tail(ringbuf);
>>>       }
>>> +    /* Not allowed to fail! */
>>> +    WARN(ret, "emit|add_request failed: %d!\n", ret);
>>>
>>>       request->head = request_start;
>>>
>>> @@ -2576,8 +2573,6 @@ int __i915_add_request(struct intel_engine_cs
>>> *ring,
>>>
>>>       /* Sanity check that the reserved size was large enough. */
>>>       intel_ring_reserved_space_end(ringbuf);
>>> -
>>> -    return 0;
>>>   }
>>>
>>>   static bool i915_context_is_banned(struct drm_i915_private *dev_priv,
>>> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>>> b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>>> index bd0e4bd..2b48a31 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>>> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>>> @@ -1061,7 +1061,7 @@ i915_gem_execbuffer_retire_commands(struct
>>> drm_device *dev,
>>>       ring->gpu_caches_dirty = true;
>>>
>>>       /* Add a breadcrumb for the completion of the batch buffer */
>>> -    (void)__i915_add_request(ring, file, obj);
>>> +    __i915_add_request(ring, file, obj);
>>>   }
>>>
>>>   static int
>>> diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c
>>> b/drivers/gpu/drm/i915/i915_gem_render_state.c
>>> index 521548a..ce4788f 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem_render_state.c
>>> +++ b/drivers/gpu/drm/i915/i915_gem_render_state.c
>>> @@ -173,7 +173,7 @@ int i915_gem_render_state_init(struct
>>> intel_engine_cs *ring)
>>>
>>>       i915_vma_move_to_active(i915_gem_obj_to_ggtt(so.obj), ring);
>>>
>>> -    ret = __i915_add_request(ring, NULL, so.obj);
>>> +    __i915_add_request(ring, NULL, so.obj);
>>>       /* __i915_add_request moves object to inactive if it fails */
>>>   out:
>>>       i915_gem_render_state_fini(&so);
>>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c
>>> b/drivers/gpu/drm/i915/intel_lrc.c
>>> index e62d396..7a75fc8 100644
>>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>>> @@ -1373,7 +1373,7 @@ static int
>>> intel_lr_context_render_state_init(struct intel_engine_cs *ring,
>>>
>>>       i915_vma_move_to_active(i915_gem_obj_to_ggtt(so.obj), ring);
>>>
>>> -    ret = __i915_add_request(ring, file, so.obj);
>>> +    __i915_add_request(ring, file, so.obj);
>>>       /* intel_logical_ring_add_request moves object to inactive if it
>>>        * fails */
>>>   out:
>>> diff --git a/drivers/gpu/drm/i915/intel_overlay.c
>>> b/drivers/gpu/drm/i915/intel_overlay.c
>>> index 25c8ec6..e7534b9 100644
>>> --- a/drivers/gpu/drm/i915/intel_overlay.c
>>> +++ b/drivers/gpu/drm/i915/intel_overlay.c
>>> @@ -220,9 +220,7 @@ static int intel_overlay_do_wait_request(struct
>>> intel_overlay *overlay,
>>>       WARN_ON(overlay->last_flip_req);
>>>       i915_gem_request_assign(&overlay->last_flip_req,
>>>                            ring->outstanding_lazy_request);
>>> -    ret = i915_add_request(ring);
>>> -    if (ret)
>>> -        return ret;
>>> +    i915_add_request(ring);
>>>
>>>       overlay->flip_tail = tail;
>>>       ret = i915_wait_request(overlay->last_flip_req);
>>> @@ -291,7 +289,9 @@ static int intel_overlay_continue(struct
>>> intel_overlay *overlay,
>>>       WARN_ON(overlay->last_flip_req);
>>>       i915_gem_request_assign(&overlay->last_flip_req,
>>>                            ring->outstanding_lazy_request);
>>> -    return i915_add_request(ring);
>>> +    i915_add_request(ring);
>>> +
>>> +    return 0;
>>>   }
>>>
>>>   static void intel_overlay_release_old_vid_tail(struct intel_overlay
>>> *overlay)
>>> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c
>>> b/drivers/gpu/drm/i915/intel_ringbuffer.c
>>> index 74c2222..7061b07 100644
>>> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
>>> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
>>> @@ -2156,14 +2156,10 @@ static int intel_wrap_ring_buffer(struct
>>> intel_engine_cs *ring)
>>>   int intel_ring_idle(struct intel_engine_cs *ring)
>>>   {
>>>       struct drm_i915_gem_request *req;
>>> -    int ret;
>>>
>>>       /* We need to add any requests required to flush the objects
>>> and ring */
>>> -    if (ring->outstanding_lazy_request) {
>>> -        ret = i915_add_request(ring);
>>> -        if (ret)
>>> -            return ret;
>>> -    }
>>> +    if (ring->outstanding_lazy_request)
>>> +        i915_add_request(ring);
>>>
>>>       /* Wait upon the last request to be completed */
>>>       if (list_empty(&ring->request_list))
>>>
>>
>

Chris Wilson June 23, 2015, 10:16 a.m. UTC | #4

On Fri, May 29, 2015 at 05:43:24PM +0100, John.C.Harrison@Intel.com wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
> 
> The i915_add_request() function is called to keep track of work that has been
> written to the ring buffer. It adds epilogue commands to track progress (seqno
> updates and such), moves the request structure onto the right list and other
> such house keeping tasks. However, the work itself has already been written to
> the ring and will get executed whether or not the add request call succeeds. So
> no matter what goes wrong, there isn't a whole lot of point in failing the call.
> 
> At the moment, this is fine(ish). If the add request does bail early on and not
> do the housekeeping, the request will still float around in the
> ring->outstanding_lazy_request field and be picked up next time. It means
> multiple pieces of work will be tagged as the same request and driver can't
> actually wait for the first piece of work until something else has been
> submitted. But it all sort of hangs together.
> 
> This patch series is all about removing the OLR and guaranteeing that each piece
> of work gets its own personal request. That means that there is no more
> 'hoovering up of forgotten requests'. If the request does not get tracked then
> it will be leaked. Thus the add request call _must_ not fail. The previous patch
> should have already ensured that it _will_ not fail by removing the potential
> for running out of ring space. This patch enforces the rule by actually removing
> the early exit paths and the return code.
> 
> Note that if something does manage to fail and the epilogue commands don't get
> written to the ring, the driver will still hang together. The request will be
> added to the tracking lists. And as in the old case, any subsequent work will
> generate a new seqno which will suffice for marking the old one as complete.
> 
> v2: Improved WARNings (Tomas Elf review request).

Nak. Daniel please revert this mess.

Even in the current code it has a failure mode it cannot handle.
-Chris

John Harrison June 23, 2015, 10:47 a.m. UTC | #5

On 23/06/2015 11:16, Chris Wilson wrote:
> On Fri, May 29, 2015 at 05:43:24PM +0100, John.C.Harrison@Intel.com wrote:
>> From: John Harrison <John.C.Harrison@Intel.com>
>>
>> The i915_add_request() function is called to keep track of work that has been
>> written to the ring buffer. It adds epilogue commands to track progress (seqno
>> updates and such), moves the request structure onto the right list and other
>> such house keeping tasks. However, the work itself has already been written to
>> the ring and will get executed whether or not the add request call succeeds. So
>> no matter what goes wrong, there isn't a whole lot of point in failing the call.
>>
>> At the moment, this is fine(ish). If the add request does bail early on and not
>> do the housekeeping, the request will still float around in the
>> ring->outstanding_lazy_request field and be picked up next time. It means
>> multiple pieces of work will be tagged as the same request and driver can't
>> actually wait for the first piece of work until something else has been
>> submitted. But it all sort of hangs together.
>>
>> This patch series is all about removing the OLR and guaranteeing that each piece
>> of work gets its own personal request. That means that there is no more
>> 'hoovering up of forgotten requests'. If the request does not get tracked then
>> it will be leaked. Thus the add request call _must_ not fail. The previous patch
>> should have already ensured that it _will_ not fail by removing the potential
>> for running out of ring space. This patch enforces the rule by actually removing
>> the early exit paths and the return code.
>>
>> Note that if something does manage to fail and the epilogue commands don't get
>> written to the ring, the driver will still hang together. The request will be
>> added to the tracking lists. And as in the old case, any subsequent work will
>> generate a new seqno which will suffice for marking the old one as complete.
>>
>> v2: Improved WARNings (Tomas Elf review request).
> Nak. Daniel please revert this mess.
>
> Even in the current code it has a failure mode it cannot handle.
> -Chris
>

Can you explain further?

Is this a new failure mode? This patch was originally posted back in 
March and there were no comments then to say that is was not the right 
direction to take. The discussion was that this might not be the 
prettiest way to do things but it is the best solution given where the 
driver is at and what we are trying to do.

John.

[03/55] drm/i915: i915_add_request must not fail

Commit Message

Comments

Patch