drm/i915: check if execlist_port is empty before using its content

Message ID 20161223054636.3924-1-changbin.du@intel.com (mailing list archive)
State New, archived

Commit Message

Du, Changbin Dec. 23, 2016, 5:46 a.m. UTC
From: "Du, Changbin" <changbin.du@intel.com>

This patch fixes a crash in reset_common_ring(). In this case,
port[0].request is NULL when the render ring is reset, so a NULL
pointer dereference occurs. We need to check the execlist_port status
first.

[   35.748034] BUG: unable to handle kernel NULL pointer dereference at 0000000000000070
[   35.749567] IP: [<ffffffff81521bfe>] reset_common_ring+0xbe/0x150
[   35.749567] Call Trace:
[   35.749567]  [<ffffffff8150ded0>] i915_gem_reset+0x150/0x270
[   35.749567]  [<ffffffff814d3c0a>] i915_reset+0x8a/0xe0
[   35.749567]  [<ffffffff814d8c21>] i915_reset_and_wakeup+0x131/0x160
[   35.749567]  [<ffffffff815298f0>] ? gen5_read8+0x110/0x110
[   35.749567]  [<ffffffff814dc97a>] i915_handle_error+0xca/0x5a0
[   35.749567]  [<ffffffff813bac9d>] ? scnprintf+0x3d/0x70
[   35.749567]  [<ffffffff814dd063>] i915_hangcheck_elapsed+0x213/0x510
[   35.749567]  [<ffffffff810c4c4b>] process_one_work+0x15b/0x470
[   35.749567]  [<ffffffff810c4fa3>] worker_thread+0x43/0x4d0
[   35.749567]  [<ffffffff810c4f60>] ? process_one_work+0x470/0x470
[   35.749567]  [<ffffffff810c4f60>] ? process_one_work+0x470/0x470
[   35.749567]  [<ffffffff810c103e>] ? call_usermodehelper_exec_async+0x12e/0x130
[   35.749567]  [<ffffffff810ca1a5>] kthread+0xc5/0xe0
[   35.749567]  [<ffffffff810ca0e0>] ? kthread_park+0x60/0x60
[   35.749567]  [<ffffffff810c0f10>] ? umh_complete+0x40/0x40
[   35.749567]  [<ffffffff81a35392>] ret_from_fork+0x22/0x30

Signed-off-by: Changbin Du <changbin.du@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Jani Nikula Dec. 23, 2016, 7:04 a.m. UTC | #1
On Fri, 23 Dec 2016, changbin.du@intel.com wrote:
> From: "Du, Changbin" <changbin.du@intel.com>
>
> This patch fix a crash in function reset_common_ring. In this case,
> the port[0].request is null when reset the render ring, so a null
> dereference exception is raised. We need to check execlist_port status
> first.
>
> [   35.748034] BUG: unable to handle kernel NULL pointer dereference at 0000000000000070
> [   35.749567] IP: [<ffffffff81521bfe>] reset_common_ring+0xbe/0x150
> [   35.749567] Call Trace:
> [   35.749567]  [<ffffffff8150ded0>] i915_gem_reset+0x150/0x270
> [   35.749567]  [<ffffffff814d3c0a>] i915_reset+0x8a/0xe0
> [   35.749567]  [<ffffffff814d8c21>] i915_reset_and_wakeup+0x131/0x160
> [   35.749567]  [<ffffffff815298f0>] ? gen5_read8+0x110/0x110
> [   35.749567]  [<ffffffff814dc97a>] i915_handle_error+0xca/0x5a0
> [   35.749567]  [<ffffffff813bac9d>] ? scnprintf+0x3d/0x70
> [   35.749567]  [<ffffffff814dd063>] i915_hangcheck_elapsed+0x213/0x510
> [   35.749567]  [<ffffffff810c4c4b>] process_one_work+0x15b/0x470
> [   35.749567]  [<ffffffff810c4fa3>] worker_thread+0x43/0x4d0
> [   35.749567]  [<ffffffff810c4f60>] ? process_one_work+0x470/0x470
> [   35.749567]  [<ffffffff810c4f60>] ? process_one_work+0x470/0x470
> [   35.749567]  [<ffffffff810c103e>] ? call_usermodehelper_exec_async+0x12e/0x130
> [   35.749567]  [<ffffffff810ca1a5>] kthread+0xc5/0xe0
> [   35.749567]  [<ffffffff810ca0e0>] ? kthread_park+0x60/0x60
> [   35.749567]  [<ffffffff810c0f10>] ? umh_complete+0x40/0x40
> [   35.749567]  [<ffffffff81a35392>] ret_from_fork+0x22/0x30
>

Fixes: ?

i.e. which commit broke things?

BR,
Jani.


> Signed-off-by: Changbin Du <changbin.du@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_lrc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 0a09024..81a9b0b 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1450,7 +1450,7 @@ static void reset_common_ring(struct intel_engine_cs *engine,
>  
>  	/* Catch up with any missed context-switch interrupts */
>  	I915_WRITE(RING_CONTEXT_STATUS_PTR(engine), _MASKED_FIELD(0xffff, 0));
> -	if (request->ctx != port[0].request->ctx) {
> +	if (!execlists_elsp_idle(engine) && request->ctx != port[0].request->ctx) {
>  		i915_gem_request_put(port[0].request);
>  		port[0] = port[1];
>  		memset(&port[1], 0, sizeof(port[1]));

Chris Wilson Dec. 23, 2016, 7:51 a.m. UTC | #2
On Fri, Dec 23, 2016 at 01:46:36PM +0800, changbin.du@intel.com wrote:
> From: "Du, Changbin" <changbin.du@intel.com>
> 
> This patch fix a crash in function reset_common_ring. In this case,
> the port[0].request is null when reset the render ring, so a null
> dereference exception is raised. We need to check execlist_port status
> first.

No. The root cause is whatever got you into the illegal condition in the
first place.
-Chris

Du, Changbin Dec. 26, 2016, 7:41 a.m. UTC | #3
> On Fri, Dec 23, 2016 at 01:46:36PM +0800, changbin.du@intel.com wrote:
> > From: "Du, Changbin" <changbin.du@intel.com>
> >
> > This patch fix a crash in function reset_common_ring. In this case,
> > the port[0].request is null when reset the render ring, so a null
> > dereference exception is raised. We need to check execlist_port status
> > first.
> 
> No. The root cause is whatever got you into the illegal condition in the
> first place.
> -Chris
> 
Thanks, I will restudy the code after I finish my current work. Since this
happens on a gvt guest, it may be related to gvt emulation.

> --
> Chris Wilson, Intel Open Source Technology Centre
Patch

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 0a09024..81a9b0b 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1450,7 +1450,7 @@  static void reset_common_ring(struct intel_engine_cs *engine,
 
 	/* Catch up with any missed context-switch interrupts */
 	I915_WRITE(RING_CONTEXT_STATUS_PTR(engine), _MASKED_FIELD(0xffff, 0));
-	if (request->ctx != port[0].request->ctx) {
+	if (!execlists_elsp_idle(engine) && request->ctx != port[0].request->ctx) {
 		i915_gem_request_put(port[0].request);
 		port[0] = port[1];
 		memset(&port[1], 0, sizeof(port[1]));
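
For readers who want to see the guard in isolation, here is a minimal standalone sketch of the pattern the hunk above relies on: the idle check is evaluated first, so short-circuit evaluation of `&&` keeps port[0].request from being dereferenced when it is NULL. The types and the helpers `ports_idle()` and `catch_up_ports()` are hypothetical stand-ins, not the real i915 structures or execlists_elsp_idle(), and "idle" is assumed here to mean simply that port[0] holds no request. As noted in the review, a guard like this only avoids the crash; it does not explain how the port came to be empty.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the driver types; not the real i915 structs. */
struct ctx { int id; };
struct request { struct ctx *ctx; };
struct port { struct request *request; };

/* Assumed model of "ELSP idle": nothing is queued in port[0]. */
static bool ports_idle(const struct port *port)
{
	return port[0].request == NULL;
}

/*
 * Returns true if port[0] was dropped and port[1] promoted into its place.
 * The idle check runs first, so port[0].request is never dereferenced
 * when it is NULL.
 */
static bool catch_up_ports(struct port *port, const struct request *request)
{
	assert(port != NULL && request != NULL);

	if (!ports_idle(port) && request->ctx != port[0].request->ctx) {
		port[0] = port[1];
		memset(&port[1], 0, sizeof(port[1]));
		return true;
	}
	return false;
}
```

Calling `catch_up_ports()` on a two-element array whose port[0].request is NULL returns false without touching the pointer, which is the case that crashed in the trace above. The sketch leaves out reference counting (the i915_gem_request_put() call in the real code).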