Message ID | 20190617100917.13110-1-chris@chris-wilson.co.uk (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | drm/i915/guc: Reduce verbosity on log overflows | expand |
On 17/06/2019 11:09, Chris Wilson wrote: > If the user is clearing the log buffer too slowly, we overflow. As this > is an expected condition, and the driver tries to handle it, reduce the > error message down to a notice. > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110817 > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> > --- > drivers/gpu/drm/i915/intel_guc_log.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/i915/intel_guc_log.c b/drivers/gpu/drm/i915/intel_guc_log.c > index bf1446629703..e3b83ecb90b5 100644 > --- a/drivers/gpu/drm/i915/intel_guc_log.c > +++ b/drivers/gpu/drm/i915/intel_guc_log.c > @@ -208,7 +208,9 @@ static bool guc_check_log_buf_overflow(struct intel_guc_log *log, > /* buffer_full_cnt is a 4 bit counter */ > log->stats[type].sampled_overflow += 16; > } > - DRM_ERROR_RATELIMITED("GuC log buffer overflow\n"); > + > + dev_notice_ratelimited(guc_to_i915(log_to_guc(log))->drm.dev, > + "GuC log buffer overflow\n"); > } > > return overflow; > Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Maybe we could even go debug, although I not sure if userspace is expected to look at dmesg to detect overflows or if the thing has some marker it writes into the file in that case. I don't see it so that's probably a slight flaw. Regards, Tvrtko
Quoting Tvrtko Ursulin (2019-06-17 15:21:54) > > On 17/06/2019 11:09, Chris Wilson wrote: > > If the user is clearing the log buffer too slowly, we overflow. As this > > is an expected condition, and the driver tries to handle it, reduce the > > error message down to a notice. > > > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110817 > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > > Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> > > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> > > --- > > drivers/gpu/drm/i915/intel_guc_log.c | 4 +++- > > 1 file changed, 3 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/gpu/drm/i915/intel_guc_log.c b/drivers/gpu/drm/i915/intel_guc_log.c > > index bf1446629703..e3b83ecb90b5 100644 > > --- a/drivers/gpu/drm/i915/intel_guc_log.c > > +++ b/drivers/gpu/drm/i915/intel_guc_log.c > > @@ -208,7 +208,9 @@ static bool guc_check_log_buf_overflow(struct intel_guc_log *log, > > /* buffer_full_cnt is a 4 bit counter */ > > log->stats[type].sampled_overflow += 16; > > } > > - DRM_ERROR_RATELIMITED("GuC log buffer overflow\n"); > > + > > + dev_notice_ratelimited(guc_to_i915(log_to_guc(log))->drm.dev, > > + "GuC log buffer overflow\n"); > > } > > > > return overflow; > > > > Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> > > Maybe we could even go debug, although I not sure if userspace is > expected to look at dmesg to detect overflows or if the thing has some > marker it writes into the file in that case. I don't see it so that's > probably a slight flaw. It wasn't completely clear to me how well userspace did handle this (I haven't run the debug tools myself) so settled for leaving a hint visible in dmesg. Ideally, if we are expecting userspace to be handling the ringbuf interface, it should be coping with the overflows then this is just noise and can be removed. -Chris
On Mon, 17 Jun 2019 12:09:17 +0200, Chris Wilson <chris@chris-wilson.co.uk> wrote: > If the user is clearing the log buffer too slowly, we overflow. As this > is an expected condition, and the driver tries to handle it, reduce the > error message down to a notice. > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110817 > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> > --- > drivers/gpu/drm/i915/intel_guc_log.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/i915/intel_guc_log.c > b/drivers/gpu/drm/i915/intel_guc_log.c > index bf1446629703..e3b83ecb90b5 100644 > --- a/drivers/gpu/drm/i915/intel_guc_log.c > +++ b/drivers/gpu/drm/i915/intel_guc_log.c > @@ -208,7 +208,9 @@ static bool guc_check_log_buf_overflow(struct > intel_guc_log *log, > /* buffer_full_cnt is a 4 bit counter */ > log->stats[type].sampled_overflow += 16; > } > - DRM_ERROR_RATELIMITED("GuC log buffer overflow\n"); > + > + dev_notice_ratelimited(guc_to_i915(log_to_guc(log))->drm.dev, > + "GuC log buffer overflow\n"); While this change alone is not harmful, I'm afraid that there might be another reason why we see this message : once buffer_full_cnt is turned on by the fw and then we reset the Guc, this field may have stale value as it is not cleared by us/uc and we may wrongly assume that there was an overflow. > } > return overflow;
Quoting Michal Wajdeczko (2019-06-17 15:38:53) > On Mon, 17 Jun 2019 12:09:17 +0200, Chris Wilson > <chris@chris-wilson.co.uk> wrote: > > > If the user is clearing the log buffer too slowly, we overflow. As this > > is an expected condition, and the driver tries to handle it, reduce the > > error message down to a notice. > > > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110817 > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > > Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> > > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> > > --- > > drivers/gpu/drm/i915/intel_guc_log.c | 4 +++- > > 1 file changed, 3 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/gpu/drm/i915/intel_guc_log.c > > b/drivers/gpu/drm/i915/intel_guc_log.c > > index bf1446629703..e3b83ecb90b5 100644 > > --- a/drivers/gpu/drm/i915/intel_guc_log.c > > +++ b/drivers/gpu/drm/i915/intel_guc_log.c > > @@ -208,7 +208,9 @@ static bool guc_check_log_buf_overflow(struct > > intel_guc_log *log, > > /* buffer_full_cnt is a 4 bit counter */ > > log->stats[type].sampled_overflow += 16; > > } > > - DRM_ERROR_RATELIMITED("GuC log buffer overflow\n"); > > + > > + dev_notice_ratelimited(guc_to_i915(log_to_guc(log))->drm.dev, > > + "GuC log buffer overflow\n"); > > While this change alone is not harmful, I'm afraid that there might > be another reason why we see this message : once buffer_full_cnt is > turned on by the fw and then we reset the Guc, this field may have > stale value as it is not cleared by us/uc and we may wrongly assume > that there was an overflow. Fwiw, at the moment, it is being thrown by the cat /debug/dri/0/* test. -Chris
Quoting Michal Wajdeczko (2019-06-17 15:38:53) > On Mon, 17 Jun 2019 12:09:17 +0200, Chris Wilson > <chris@chris-wilson.co.uk> wrote: > > > If the user is clearing the log buffer too slowly, we overflow. As this > > is an expected condition, and the driver tries to handle it, reduce the > > error message down to a notice. > > > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110817 > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > > Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> > > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> > > --- > > drivers/gpu/drm/i915/intel_guc_log.c | 4 +++- > > 1 file changed, 3 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/gpu/drm/i915/intel_guc_log.c > > b/drivers/gpu/drm/i915/intel_guc_log.c > > index bf1446629703..e3b83ecb90b5 100644 > > --- a/drivers/gpu/drm/i915/intel_guc_log.c > > +++ b/drivers/gpu/drm/i915/intel_guc_log.c > > @@ -208,7 +208,9 @@ static bool guc_check_log_buf_overflow(struct > > intel_guc_log *log, > > /* buffer_full_cnt is a 4 bit counter */ > > log->stats[type].sampled_overflow += 16; > > } > > - DRM_ERROR_RATELIMITED("GuC log buffer overflow\n"); > > + > > + dev_notice_ratelimited(guc_to_i915(log_to_guc(log))->drm.dev, > > + "GuC log buffer overflow\n"); > > While this change alone is not harmful, I'm afraid that there might > be another reason why we see this message : once buffer_full_cnt is > turned on by the fw and then we reset the Guc, this field may have > stale value as it is not cleared by us/uc and we may wrongly assume > that there was an overflow. So yay/nay on silencing the test output so I can blissfully ignore it in bugzilla? -Chris
On Tue, 18 Jun 2019 12:24:30 +0200, Chris Wilson <chris@chris-wilson.co.uk> wrote: > Quoting Michal Wajdeczko (2019-06-17 15:38:53) >> On Mon, 17 Jun 2019 12:09:17 +0200, Chris Wilson >> <chris@chris-wilson.co.uk> wrote: >> >> > If the user is clearing the log buffer too slowly, we overflow. As >> this >> > is an expected condition, and the driver tries to handle it, reduce >> the >> > error message down to a notice. >> > >> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110817 >> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> >> > Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> >> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> >> > --- >> > drivers/gpu/drm/i915/intel_guc_log.c | 4 +++- >> > 1 file changed, 3 insertions(+), 1 deletion(-) >> > >> > diff --git a/drivers/gpu/drm/i915/intel_guc_log.c >> > b/drivers/gpu/drm/i915/intel_guc_log.c >> > index bf1446629703..e3b83ecb90b5 100644 >> > --- a/drivers/gpu/drm/i915/intel_guc_log.c >> > +++ b/drivers/gpu/drm/i915/intel_guc_log.c >> > @@ -208,7 +208,9 @@ static bool guc_check_log_buf_overflow(struct >> > intel_guc_log *log, >> > /* buffer_full_cnt is a 4 bit counter */ >> > log->stats[type].sampled_overflow += 16; >> > } >> > - DRM_ERROR_RATELIMITED("GuC log buffer overflow\n"); >> > + >> > + >> dev_notice_ratelimited(guc_to_i915(log_to_guc(log))->drm.dev, >> > + "GuC log buffer overflow\n"); >> >> While this change alone is not harmful, I'm afraid that there might >> be another reason why we see this message : once buffer_full_cnt is >> turned on by the fw and then we reset the Guc, this field may have >> stale value as it is not cleared by us/uc and we may wrongly assume >> that there was an overflow. > > So yay/nay on silencing the test output so I can blissfully ignore it in > bugzilla? OK to merge, we can debug extra 'notice' logs separately Michal
diff --git a/drivers/gpu/drm/i915/intel_guc_log.c b/drivers/gpu/drm/i915/intel_guc_log.c index bf1446629703..e3b83ecb90b5 100644 --- a/drivers/gpu/drm/i915/intel_guc_log.c +++ b/drivers/gpu/drm/i915/intel_guc_log.c @@ -208,7 +208,9 @@ static bool guc_check_log_buf_overflow(struct intel_guc_log *log, /* buffer_full_cnt is a 4 bit counter */ log->stats[type].sampled_overflow += 16; } - DRM_ERROR_RATELIMITED("GuC log buffer overflow\n"); + + dev_notice_ratelimited(guc_to_i915(log_to_guc(log))->drm.dev, + "GuC log buffer overflow\n"); } return overflow;
If the user is clearing the log buffer too slowly, we overflow. As this is an expected condition, and the driver tries to handle it, reduce the error message down to a notice. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110817 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> --- drivers/gpu/drm/i915/intel_guc_log.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)