Message ID | 1423729562-11051-1-git-send-email-mika.kuoppala@intel.com (mailing list archive) |
---|---|
State | New, archived |
On Thu, Feb 12, 2015 at 10:26:02AM +0200, Mika Kuoppala wrote:
> We use the pid of the process which opened our device when
> we track which was the culprit of the gpu hang. But as that
> file descriptor might get inherited, we might blame the
> wrong process when we record the error state.
>
> Track process identifiers in requests to always find
> the correct offender.
>
> v2: Track only user processes (Chris)
>
> Cc: Kenneth Graunke <kenneth@whitecape.org>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
> ---

> @@ -2572,6 +2574,9 @@ static void i915_gem_free_request(struct drm_i915_gem_request *request)
>  	list_del(&request->list);
>  	i915_gem_request_remove_from_client(request);
>
> +	if (request->pid)

put_pid() does the NULL check itself, might as well take advantage of that.

> +		put_pid(request->pid);
> +
>  	i915_gem_request_unreference(request);
>  }

Otherwise,
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris
On Thu, Feb 12, 2015 at 08:51:20AM +0000, Chris Wilson wrote:
> On Thu, Feb 12, 2015 at 10:26:02AM +0200, Mika Kuoppala wrote:
> > We use the pid of the process which opened our device when
> > we track which was the culprit of the gpu hang. But as that
> > file descriptor might get inherited, we might blame the
> > wrong process when we record the error state.
> >
> > Track process identifiers in requests to always find
> > the correct offender.
> >
> > v2: Track only user processes (Chris)
> >
> > Cc: Kenneth Graunke <kenneth@whitecape.org>
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
> > ---
>
> > @@ -2572,6 +2574,9 @@ static void i915_gem_free_request(struct drm_i915_gem_request *request)
> >  	list_del(&request->list);
> >  	i915_gem_request_remove_from_client(request);
> >
> > +	if (request->pid)
>
> put_pid() does the NULL check itself, might as well take advantage of
> that.

Done while merging.

> > +		put_pid(request->pid);
> > +
> >  	i915_gem_request_unreference(request);
> >  }
>
> Otherwise,
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

Queued for -next, thanks for the patch.
-Daniel
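The review remark about put_pid() relies on a common pattern: a reference-drop helper that tolerates NULL, so callers can skip their own check. The following is a minimal userspace sketch of that pattern, not the kernel implementation. The names `struct ref_obj`, `ref_obj_new()`, `ref_obj_get()` and `ref_obj_put()` are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for a reference-counted object. */
struct ref_obj {
	int count;
};

/* Allocate an object holding one reference; returns NULL on failure. */
struct ref_obj *ref_obj_new(void)
{
	struct ref_obj *o = calloc(1, sizeof(*o));

	if (o)
		o->count = 1;
	return o;
}

/* Take a reference. NULL is passed through unchanged. */
struct ref_obj *ref_obj_get(struct ref_obj *o)
{
	if (o)
		o->count++;
	return o;
}

/*
 * Drop a reference. NULL is accepted and ignored, so callers do not
 * need an "if (o)" guard of their own.
 * Returns 1 if this call freed the object, 0 otherwise.
 */
int ref_obj_put(struct ref_obj *o)
{
	if (!o)
		return 0;

	assert(o->count > 0);
	if (--o->count == 0) {
		free(o);
		return 1;
	}
	return 0;
}
```

With a helper shaped like this, a guarded call such as `if (o) ref_obj_put(o);` reduces to a bare `ref_obj_put(o);`, which is the simplification applied to the patch while merging.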
Tested-By: PRC QA PRTS (Patch Regression Test System Contact: shuang.he@intel.com)
Task id: 5766
-------------------------------------Summary-------------------------------------
Platform    Delta    drm-intel-nightly    Series Applied
PNV                  282/282              282/282
ILK                  313/313              313/313
SNB                  309/323              309/323
IVB                  380/380              380/380
BYT                  296/296              296/296
HSW         -1       425/425              424/425
BDW         -1       318/318              317/318
-------------------------------------Detailed-------------------------------------
Platform    Test                           drm-intel-nightly    Series Applied
*HSW        igt_gem_storedw_loop_vebox     PASS(2)              DMESG_WARN(1)PASS(1)
*BDW        igt_gem_gtt_hog                PASS(8)              DMESG_WARN(1)PASS(1)
Note: Pay particular attention to lines starting with '*'.
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index c0b8644..9093654 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2153,6 +2153,9 @@ struct drm_i915_gem_request {
 	/** file_priv list entry for this request */
 	struct list_head client_list;
 
+	/** process identifier submitting this request */
+	struct pid *pid;
+
 	uint32_t uniq;
 
 	/**
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index c26d36c..2bb2e12 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2492,6 +2492,8 @@ int __i915_add_request(struct intel_engine_cs *ring,
 		list_add_tail(&request->client_list,
 			      &file_priv->mm.request_list);
 		spin_unlock(&file_priv->mm.lock);
+
+		request->pid = get_pid(task_pid(current));
 	}
 
 	trace_i915_gem_request_add(request);
@@ -2572,6 +2574,9 @@ static void i915_gem_free_request(struct drm_i915_gem_request *request)
 	list_del(&request->list);
 	i915_gem_request_remove_from_client(request);
 
+	if (request->pid)
+		put_pid(request->pid);
+
 	i915_gem_request_unreference(request);
 }
 
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 48ddbf4..a982849 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -994,12 +994,11 @@ static void i915_gem_record_rings(struct drm_device *dev,
 					i915_error_ggtt_object_create(dev_priv,
 							     ring->scratch.obj);
 
-			if (request->file_priv) {
+			if (request->pid) {
 				struct task_struct *task;
 
 				rcu_read_lock();
-				task = pid_task(request->file_priv->file->pid,
-						PIDTYPE_PID);
+				task = pid_task(request->pid, PIDTYPE_PID);
 				if (task) {
 					strcpy(error->ring[i].comm, task->comm);
 					error->ring[i].pid = task->pid;
We use the pid of the process which opened our device when
we track which was the culprit of the gpu hang. But as that
file descriptor might get inherited, we might blame the
wrong process when we record the error state.

Track process identifiers in requests to always find
the correct offender.

v2: Track only user processes (Chris)

Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h       | 3 +++
 drivers/gpu/drm/i915/i915_gem.c       | 5 +++++
 drivers/gpu/drm/i915/i915_gpu_error.c | 5 ++---
 3 files changed, 10 insertions(+), 3 deletions(-)
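
The misattribution described in the commit message can be illustrated in userspace. This is a rough sketch under stated assumptions, not driver code. `struct fake_file`, `struct fake_request`, `fake_open()`, `fake_submit()` and `inherited_state_misattributes()` are hypothetical names that model "pid recorded at open time" versus "pid recorded at submission time" across a fork().

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for per-open-file state: pid recorded once, at open. */
struct fake_file {
	pid_t opener;
};

/* Hypothetical stand-in for a request: pid recorded at submission. */
struct fake_request {
	pid_t submitter;
};

static void fake_open(struct fake_file *f)
{
	f->opener = getpid();
}

static void fake_submit(struct fake_request *rq)
{
	rq->submitter = getpid();
}

/*
 * Open in the parent, submit from a forked child that inherited the state.
 * Returns 1 if the open-time pid names the wrong process while the
 * per-request pid names the actual submitter, 0 if not, -1 on error.
 */
int inherited_state_misattributes(void)
{
	struct fake_file f;
	pid_t child;
	int status;

	fake_open(&f);
	assert(f.opener == getpid());

	child = fork();
	if (child < 0)
		return -1;

	if (child == 0) {
		/* Child: inherited f, but it is a different process. */
		struct fake_request rq;
		int ok;

		fake_submit(&rq);
		ok = (f.opener != getpid()) && (rq.submitter == getpid());
		_exit(ok ? 0 : 1);
	}

	if (waitpid(child, &status, 0) < 0)
		return -1;
	return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 1 : 0;
}
```

In this model the child sees `f.opener` still naming the parent, which is the "wrong process" case from the commit message. The value recorded by `fake_submit()` matches the process that actually submitted.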