Message ID | 20240808204943.911727-1-jesus.narvaez@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | drm/i915/guc: Change GEM_WARN_ON to guc_err to prevent taints in CI | expand |
-----Original Message----- From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of Jesus Narvaez Sent: Thursday, August 8, 2024 1:50 PM To: intel-gfx@lists.freedesktop.org Cc: Narvaez, Jesus <jesus.narvaez@intel.com> Subject: [PATCH] drm/i915/guc: Change GEM_WARN_ON to guc_err to prevent taints in CI > > This warning was supposed to catch a harmless issue, but changing to > guc_error should prevent kernel taints in CI runs. I'll defer to your judgement on this, but IMO if we just want to log when this happens and guarantee we don't throw kernel taints in CI, we might want to consider using guc_info instead. Not blocking on it, though. Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> -Jonathan Cavitt > > Signed-off-by: Jesus Narvaez <jesus.narvaez@intel.com> > --- > drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c > index 9400d0eb682b..c3a5d9e1288e 100644 > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c > @@ -2014,11 +2014,12 @@ void intel_guc_submission_reset_finish(struct intel_guc *guc) > > /* > * Technically possible for either of these values to be non-zero here, > - * but very unlikely + harmless. Regardless let's add a warn so we can > + * but very unlikely + harmless. Regardless let's add an error so we can > * see in CI if this happens frequently / a precursor to taking down the > * machine. > */ > - GEM_WARN_ON(atomic_read(&guc->outstanding_submission_g2h)); > + if (atomic_read(&guc->outstanding_submission_g2h)) > + guc_err(guc, "Unexpected outstanding GuC to Host in reset finish\n"); > atomic_set(&guc->outstanding_submission_g2h, 0); > > intel_guc_global_policies_update(guc); > -- > 2.34.1 > >
On 8/9/2024 2:39 PM, Cavitt, Jonathan wrote: > -----Original Message----- > From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of Jesus Narvaez > Sent: Thursday, August 8, 2024 1:50 PM > To: intel-gfx@lists.freedesktop.org > Cc: Narvaez, Jesus <jesus.narvaez@intel.com> > Subject: [PATCH] drm/i915/guc: Change GEM_WARN_ON to guc_err to prevent taints in CI >> This warning was supposed to catch a harmless issue, but changing to >> guc_error should prevent kernel taints in CI runs. > I'll defer to your judgement on this, but IMO if we just want to log when this > happens and guarantee we don't throw kernel taints in CI, we might want to > consider using guc_info instead. Not blocking on it, though. > Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> > -Jonathan Cavitt Hi Jonathan, I think using guc_err would be preferred because we still want to catch the error when it happens, along with preventing kernel taints in CI. - Jesus Narvaez > >> Signed-off-by: Jesus Narvaez <jesus.narvaez@intel.com> >> --- >> drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 5 +++-- >> 1 file changed, 3 insertions(+), 2 deletions(-) >> >> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c >> index 9400d0eb682b..c3a5d9e1288e 100644 >> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c >> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c >> @@ -2014,11 +2014,12 @@ void intel_guc_submission_reset_finish(struct intel_guc *guc) >> >> /* >> * Technically possible for either of these values to be non-zero here, >> - * but very unlikely + harmless. Regardless let's add a warn so we can >> + * but very unlikely + harmless. Regardless let's add an error so we can >> * see in CI if this happens frequently / a precursor to taking down the >> * machine. >> */ >> - GEM_WARN_ON(atomic_read(&guc->outstanding_submission_g2h)); >> + if (atomic_read(&guc->outstanding_submission_g2h)) >> + guc_err(guc, "Unexpected outstanding GuC to Host in reset finish\n"); >> atomic_set(&guc->outstanding_submission_g2h, 0); >> >> intel_guc_global_policies_update(guc); >> -- >> 2.34.1 >> >>
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c index 9400d0eb682b..c3a5d9e1288e 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c @@ -2014,11 +2014,12 @@ void intel_guc_submission_reset_finish(struct intel_guc *guc) /* * Technically possible for either of these values to be non-zero here, - * but very unlikely + harmless. Regardless let's add a warn so we can + * but very unlikely + harmless. Regardless let's add an error so we can * see in CI if this happens frequently / a precursor to taking down the * machine. */ - GEM_WARN_ON(atomic_read(&guc->outstanding_submission_g2h)); + if (atomic_read(&guc->outstanding_submission_g2h)) + guc_err(guc, "Unexpected outstanding GuC to Host in reset finish\n"); atomic_set(&guc->outstanding_submission_g2h, 0); intel_guc_global_policies_update(guc);
This warning was supposed to catch a harmless issue, but changing to guc_error should prevent kernel taints in CI runs. Signed-off-by: Jesus Narvaez <jesus.narvaez@intel.com> --- drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)