Message ID | 20211109122037.171128-1-tvrtko.ursulin@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Series | drm/i915: Skip error capture when wedged on init |
On Tue, 9 Nov 2021 at 12:20, Tvrtko Ursulin
<tvrtko.ursulin@linux.intel.com> wrote:
>
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> Trying to capture uninitialised engines when we wedged on init ends in
> tears. Skip that together with uC capture, since failure to initialise the
> latter can actually be one of the reasons for wedging on init.
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

This fixes the issue with missing GuC wedging the GPU and then blowing
up when trying to use the driver?

Reviewed-by: Matthew Auld <matthew.auld@intel.com>

> ---
>  drivers/gpu/drm/i915/i915_gpu_error.c | 10 +++++++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> index 2a2d7643b551..aa2b3aad9643 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> @@ -1866,10 +1866,14 @@ i915_gpu_coredump(struct intel_gt *gt, intel_engine_mask_t engine_mask)
>  	}
>
>  	gt_record_info(error->gt);
> -	gt_record_engines(error->gt, engine_mask, compress);
>
> -	if (INTEL_INFO(i915)->has_gt_uc)
> -		error->gt->uc = gt_record_uc(error->gt, compress);
> +	if (!intel_gt_has_unrecoverable_error(gt)) {
> +		gt_record_engines(error->gt, engine_mask, compress);
> +
> +		if (INTEL_INFO(i915)->has_gt_uc)
> +			error->gt->uc = gt_record_uc(error->gt,
> +						     compress);
> +	}
>
>  	i915_vma_capture_finish(error->gt, compress);
>
> --
> 2.30.2
>
On 10/11/2021 10:48, Matthew Auld wrote:
> On Tue, 9 Nov 2021 at 12:20, Tvrtko Ursulin
> <tvrtko.ursulin@linux.intel.com> wrote:
>>
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> Trying to capture uninitialised engines when we wedged on init ends in
>> tears. Skip that together with uC capture, since failure to initialise the
>> latter can actually be one of the reasons for wedging on init.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> This fixes the issue with missing GuC wedging the GPU and then blowing
> up when trying to use the driver?

Probably does not blow up when using the driver, but definitely does
when accessing error state.

Someone suggested it would instead be better to call
i915_disable_error_state from wedge on init/fini, and I think indeed it
would, so I plan to send v2 looking like that.

Regards,

Tvrtko

> Reviewed-by: Matthew Auld <matthew.auld@intel.com>
>
>> ---
>>  drivers/gpu/drm/i915/i915_gpu_error.c | 10 +++++++---
>>  1 file changed, 7 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
>> index 2a2d7643b551..aa2b3aad9643 100644
>> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
>> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
>> @@ -1866,10 +1866,14 @@ i915_gpu_coredump(struct intel_gt *gt, intel_engine_mask_t engine_mask)
>>  	}
>>
>>  	gt_record_info(error->gt);
>> -	gt_record_engines(error->gt, engine_mask, compress);
>>
>> -	if (INTEL_INFO(i915)->has_gt_uc)
>> -		error->gt->uc = gt_record_uc(error->gt, compress);
>> +	if (!intel_gt_has_unrecoverable_error(gt)) {
>> +		gt_record_engines(error->gt, engine_mask, compress);
>> +
>> +		if (INTEL_INFO(i915)->has_gt_uc)
>> +			error->gt->uc = gt_record_uc(error->gt,
>> +						     compress);
>> +	}
>>
>>  	i915_vma_capture_finish(error->gt, compress);
>>
>> --
>> 2.30.2
>>
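For readers following the thread, the alternative mentioned in the reply above (disabling the error state at the point of wedging, rather than guarding inside the capture path) can be illustrated with a small standalone model. This is only a sketch under assumptions: the struct, the field names, the mock_ functions and the choice of -ENODEV are all hypothetical and do not mirror the driver's actual code or the eventual v2.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the device; not the real i915 structure. */
struct mock_device {
	bool wedged;
	int error_state_err;	/* 0 = error state available, negative errno = disabled */
	int captures;		/* number of times a capture was actually performed */
};

/* Mark the error state as unavailable; the first recorded reason wins. */
void mock_disable_error_state(struct mock_device *dev, int err)
{
	if (dev->error_state_err == 0)
		dev->error_state_err = err;
}

/* Wedging on init also switches off error capture in one place. */
void mock_set_wedged_on_init(struct mock_device *dev)
{
	dev->wedged = true;
	mock_disable_error_state(dev, -ENODEV);
}

/* Reading the error state fails early instead of walking engine state. */
int mock_read_error_state(struct mock_device *dev)
{
	if (dev->error_state_err)
		return dev->error_state_err;

	dev->captures++;
	return 0;
}
```

In this model the capture code needs no knowledge of why the device wedged: once mock_set_wedged_on_init() has run, every later mock_read_error_state() call returns the stored error and never reaches the capture step.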
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 2a2d7643b551..aa2b3aad9643 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1866,10 +1866,14 @@ i915_gpu_coredump(struct intel_gt *gt, intel_engine_mask_t engine_mask)
 	}
 
 	gt_record_info(error->gt);
-	gt_record_engines(error->gt, engine_mask, compress);
 
-	if (INTEL_INFO(i915)->has_gt_uc)
-		error->gt->uc = gt_record_uc(error->gt, compress);
+	if (!intel_gt_has_unrecoverable_error(gt)) {
+		gt_record_engines(error->gt, engine_mask, compress);
+
+		if (INTEL_INFO(i915)->has_gt_uc)
+			error->gt->uc = gt_record_uc(error->gt,
+						     compress);
+	}
 
 	i915_vma_capture_finish(error->gt, compress);
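The control flow the patch introduces can be modelled outside the kernel as follows. This is a minimal sketch, assuming simplified mock types: struct mock_gt, struct mock_coredump and the mock_ functions are hypothetical and only stand in for the roles of intel_gt_has_unrecoverable_error(), gt_record_engines() and gt_record_uc() in the diff above.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's structures; not the real i915 types. */
struct mock_gt {
	bool wedged_on_init;	/* models intel_gt_has_unrecoverable_error() */
	bool has_uc;		/* models INTEL_INFO(i915)->has_gt_uc */
};

struct mock_coredump {
	bool info_recorded;
	bool engines_recorded;
	bool uc_recorded;
};

/* Models the predicate the patch checks before touching engine state. */
bool mock_gt_has_unrecoverable_error(const struct mock_gt *gt)
{
	return gt->wedged_on_init;
}

void mock_capture(const struct mock_gt *gt, struct mock_coredump *dump)
{
	/* General GT info is recorded unconditionally, as in the patch. */
	dump->info_recorded = true;
	dump->engines_recorded = false;
	dump->uc_recorded = false;

	/* Engines and uC may be uninitialised after wedging on init: skip both. */
	if (mock_gt_has_unrecoverable_error(gt))
		return;

	dump->engines_recorded = true;

	/* uC state is only captured on platforms that have it. */
	if (gt->has_uc)
		dump->uc_recorded = true;
}
```

With wedged_on_init set, only info_recorded ends up true; with it clear, engine capture runs and uC capture follows whenever has_uc is set.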