[4/5] drm/i915: properly SIGBUS on I/O errors

Message ID

1341433123-23055-5-git-send-email-daniel.vetter@ffwll.ch (mailing list archive)

State

New, archived

Headers

From: Daniel Vetter <daniel.vetter@ffwll.ch>
To: Intel Graphics Development <intel-gfx@lists.freedesktop.org>
Date: Wed,  4 Jul 2012 22:18:42 +0200
Message-Id: <1341433123-23055-5-git-send-email-daniel.vetter@ffwll.ch>
In-Reply-To: <1341433123-23055-1-git-send-email-daniel.vetter@ffwll.ch>
References: <1341433123-23055-1-git-send-email-daniel.vetter@ffwll.ch>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Subject: [Intel-gfx] [PATCH 4/5] drm/i915: properly SIGBUS on I/O errors
Precedence: list
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Sender: intel-gfx-bounces+patchwork-intel-gfx=patchwork.kernel.org@lists.freedesktop.org
Errors-To: intel-gfx-bounces+patchwork-intel-gfx=patchwork.kernel.org@lists.freedesktop.org

Commit Message

Daniel Vetter July 4, 2012, 8:18 p.m. UTC

... instead of looping endlessly with no hope of ever serving that
page-fault. We only need to break out of this loop when the gpu died,
to run the reset work (and hopefully resurrect it).

This seems to have been lost in:

commit d9bc7e9f32716901c617e1f0fb6ce0f74f172686
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Mon Feb 7 13:09:31 2011 +0000

    drm/i915: Fix infinite loop regression from 21dd3734

Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_gem.c |    5 +++++
 1 file changed, 5 insertions(+)
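
The endless loop described in the commit message can be modelled in user space. This is only a rough sketch under stated assumptions: `failing_page_in()` and `count_fault_attempts()` are hypothetical names, not i915 functions, and the attempt cap exists only so the model terminates. It assumes a page-in that fails with -EIO every time, as with a dead disk, and counts how often the fault would be retried with and without the SIGBUS exit.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a page-in that always fails, e.g. a dead disk. */
static int failing_page_in(void)
{
	return -EIO;
}

/*
 * Count how many times the fault is attempted before we either give up
 * (deliver SIGBUS) or hit max_attempts. Without the SIGBUS exit the
 * access simply re-faults every time, so only the cap stops the loop.
 */
static int count_fault_attempts(bool sigbus_on_io_error, bool gpu_wedged,
				int max_attempts)
{
	int attempts = 0;

	while (attempts < max_attempts) {
		int ret = failing_page_in();

		attempts++;
		/* -EIO without a wedged GPU: nothing will ever fix this. */
		if (ret == -EIO && sigbus_on_io_error && !gpu_wedged)
			break;
		/* Otherwise the fault returns unresolved and is retried. */
	}
	return attempts;
}
```

With the exit enabled and the GPU not wedged, the model stops after a single attempt; in every other case it runs until the cap, which stands in for "forever" here. In the wedged case the real driver would expect the reset work to clear the condition, which this model does not simulate.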

Comments

Daniel Vetter July 4, 2012, 8:40 p.m. UTC | #1

On Wed, Jul 04, 2012 at 10:18:42PM +0200, Daniel Vetter wrote:
> ... instead of looping endlessly with no hope of ever serving that
> page-fault. We only need to break out of this loop when the gpu died,
> to run the reset work (and hopefully resurrect it).

To clarify the questions Chris raised on IRC: this is about handling I/O
errors that do not come from our own code, e.g. when the disk dies while
we try to swap in a GEM bo. So this patch remedies the issue that the
current code only handles GPU-death-induced cases of -EIO. Admittedly,
dying disks are much rarer than hanging GPUs ...

I'll add that blurb to the commit.

-Daniel

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 7d28555..2b54142 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1141,6 +1141,11 @@  unlock:
 out:
 	switch (ret) {
 	case -EIO:
+		/* If this -EIO is due to a gpu hang, give the reset code a
+		 * chance to clean up the mess. Otherwise return the proper
+		 * SIGBUS. */
+		if (!atomic_read(&dev_priv->mm.wedged))
+			return VM_FAULT_SIGBUS;
 	case -EAGAIN:
 		/* Give the error handler a chance to run and move the
 		 * objects off the GPU active list. Next time we service the
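
For readers who want to poke at the control flow outside the kernel, the hunk above can be approximated as a standalone function. This is a sketch, not the driver code: `enum fault_result` and `map_fault_error()` are hypothetical stand-ins for the VM_FAULT_* codes and the tail of the fault handler, `gpu_wedged` stands in for `atomic_read(&dev_priv->mm.wedged)`, and the `default` branch is an assumption because the rest of the switch is not part of this patch.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's VM_FAULT_* return codes. */
enum fault_result {
	FAULT_RETRY,	/* return without a mapping, let the access re-fault */
	FAULT_SIGBUS,	/* give up and signal the faulting process */
	FAULT_OTHER	/* cases handled by parts of the switch not shown */
};

static enum fault_result map_fault_error(int ret, bool gpu_wedged)
{
	switch (ret) {
	case -EIO:
		/* -EIO with a healthy GPU came from elsewhere (e.g. disk). */
		if (!gpu_wedged)
			return FAULT_SIGBUS;
		/* fall through: GPU hang, let the reset code run first */
	case -EAGAIN:
		return FAULT_RETRY;
	default:
		/* Assumption: everything else is out of scope here. */
		return FAULT_OTHER;
	}
}
```

The point of interest is the deliberate fall-through: -EIO only shares the retry path with -EAGAIN when the GPU is wedged, which is the one case where retrying can eventually succeed.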
