Message ID | 20220728022028.2190627-4-John.C.Harrison@Intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Fixes and improvements to GuC logging and error capture | expand |
I have a question on below code. Everything else looked good. Will r-b as soon as we can close on below question ...alan On Wed, 2022-07-27 at 19:20 -0700, John.C.Harrison@Intel.com wrote: > From: John Harrison <John.C.Harrison@Intel.com> > > It is useful to be able to match GuC events to kernel events when > looking at the GuC log. That requires being able to convert GuC > timestamps to kernel time. So, when dumping error captures and/or GuC > logs, include a stamp in both time zones plus the clock frequency. > > Signed-off-by: John Harrison <John.C.Harrison@Intel.com> > --- a/drivers/gpu/drm/i915/i915_gpu_error.c > +++ b/drivers/gpu/drm/i915/i915_gpu_error.c > @@ -1675,6 +1678,13 @@ gt_record_uc(struct intel_gt_coredump *gt, > */ > error_uc->guc_fw.path = kstrdup(uc->guc.fw.path, ALLOW_FAIL); > error_uc->huc_fw.path = kstrdup(uc->huc.fw.path, ALLOW_FAIL); > + > + /* > + * Save the GuC log and include a timestamp reference for converting the > + * log times to system times (in conjunction with the error->boottime and > + * gt->clock_frequency fields saved elsewhere). > + */ > + error_uc->timestamp = intel_uncore_read(gt->_gt->uncore, GUCPMTIMESTAMP); Alan:this register is in the GUC-SHIM domain and so unless i am mistaken u might need to ensure we hold a wakeref so that are getting a live value of the real timestamp register that this register is mirror-ing and not a stale snapshot. Or was this already done farther up the stack? Or are we doing the opposite - in which case we should ensure we drop al wakeref prior to this point. (which i am not sure is a reliable method since we wouldnt know what GuC ref was at). > error_uc->guc_log = create_vma_coredump(gt->_gt, uc->guc.log.vma, > "GuC log buffer", compress); >
On 8/4/2022 17:40, Teres Alexis, Alan Previn wrote: > I have a question on below code. Everything else looked good. > Will r-b as soon as we can close on below question > ...alan > > > On Wed, 2022-07-27 at 19:20 -0700, John.C.Harrison@Intel.com wrote: >> From: John Harrison <John.C.Harrison@Intel.com> >> >> It is useful to be able to match GuC events to kernel events when >> looking at the GuC log. That requires being able to convert GuC >> timestamps to kernel time. So, when dumping error captures and/or GuC >> logs, include a stamp in both time zones plus the clock frequency. >> >> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> >> --- a/drivers/gpu/drm/i915/i915_gpu_error.c >> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c >> @@ -1675,6 +1678,13 @@ gt_record_uc(struct intel_gt_coredump *gt, >> */ >> error_uc->guc_fw.path = kstrdup(uc->guc.fw.path, ALLOW_FAIL); >> error_uc->huc_fw.path = kstrdup(uc->huc.fw.path, ALLOW_FAIL); >> + >> + /* >> + * Save the GuC log and include a timestamp reference for converting the >> + * log times to system times (in conjunction with the error->boottime and >> + * gt->clock_frequency fields saved elsewhere). >> + */ >> + error_uc->timestamp = intel_uncore_read(gt->_gt->uncore, GUCPMTIMESTAMP); > Alan:this register is in the GUC-SHIM domain and so unless i am mistaken u might need to ensure we hold a wakeref so > that are getting a live value of the real timestamp register that this register is mirror-ing and not a stale snapshot. > Or was this already done farther up the stack? Or are we doing the opposite - in which case we should ensure we drop al > wakeref prior to this point. (which i am not sure is a reliable method since we wouldnt know what GuC ref was at). The intel_uncore_read() does a forcewake acquire implicitly. Not sure what you mean about dropping all wakerefs prior to this point? John. >> error_uc->guc_log = create_vma_coredump(gt->_gt, uc->guc.log.vma, >> "GuC log buffer", compress); >>
Sounds good - thanks. (ignore the "doing the opposite" comment). Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com> On Mon, 2022-08-08 at 11:43 -0700, Harrison, John C wrote: > On 8/4/2022 17:40, Teres Alexis, Alan Previn wrote: > > I have a question on below code. Everything else looked good. > > Will r-b as soon as we can close on below question > > ...alan > > > > > > On Wed, 2022-07-27 at 19:20 -0700, John.C.Harrison@Intel.com wrote: > > > From: John Harrison <John.C.Harrison@Intel.com> > > > > > > It is useful to be able to match GuC events to kernel events when > > > looking at the GuC log. That requires being able to convert GuC > > > timestamps to kernel time. So, when dumping error captures and/or GuC > > > logs, include a stamp in both time zones plus the clock frequency. > > > > > > Signed-off-by: John Harrison <John.C.Harrison@Intel.com> > > > --- a/drivers/gpu/drm/i915/i915_gpu_error.c > > > +++ b/drivers/gpu/drm/i915/i915_gpu_error.c > > > @@ -1675,6 +1678,13 @@ gt_record_uc(struct intel_gt_coredump *gt, > > > */ > > > error_uc->guc_fw.path = kstrdup(uc->guc.fw.path, ALLOW_FAIL); > > > error_uc->huc_fw.path = kstrdup(uc->huc.fw.path, ALLOW_FAIL); > > > + > > > + /* > > > + * Save the GuC log and include a timestamp reference for converting the > > > + * log times to system times (in conjunction with the error->boottime and > > > + * gt->clock_frequency fields saved elsewhere). > > > + */ > > > + error_uc->timestamp = intel_uncore_read(gt->_gt->uncore, GUCPMTIMESTAMP); > > Alan:this register is in the GUC-SHIM domain and so unless i am mistaken u might need to ensure we hold a wakeref so > > that are getting a live value of the real timestamp register that this register is mirror-ing and not a stale snapshot. > > Or was this already done farther up the stack? Or are we doing the opposite - in which case we should ensure we drop al > > wakeref prior to this point. (which i am not sure is a reliable method since we wouldnt know what GuC ref was at). > The intel_uncore_read() does a forcewake acquire implicitly. > > Not sure what you mean about dropping all wakerefs prior to this point? > > John. > > > > error_uc->guc_log = create_vma_coredump(gt->_gt, uc->guc.log.vma, > > > "GuC log buffer", compress); > > >
On Wed, 27 Jul 2022, John.C.Harrison@Intel.com wrote: > From: John Harrison <John.C.Harrison@Intel.com> > > It is useful to be able to match GuC events to kernel events when > looking at the GuC log. That requires being able to convert GuC > timestamps to kernel time. So, when dumping error captures and/or GuC > logs, include a stamp in both time zones plus the clock frequency. > > Signed-off-by: John Harrison <John.C.Harrison@Intel.com> > --- > drivers/gpu/drm/i915/gt/intel_gt_regs.h | 2 ++ > drivers/gpu/drm/i915/gt/uc/intel_guc.c | 19 +++++++++++++++++++ > drivers/gpu/drm/i915/gt/uc/intel_guc.h | 2 ++ > drivers/gpu/drm/i915/gt/uc/intel_guc_log.c | 2 ++ > drivers/gpu/drm/i915/i915_gpu_error.c | 12 ++++++++++++ > drivers/gpu/drm/i915/i915_gpu_error.h | 3 +++ > 6 files changed, 40 insertions(+) > > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h > index 60d6eb5f245b7..fc7979bd91db5 100644 > --- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h > +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h > @@ -1007,6 +1007,8 @@ > #define GEN11_LSN_UNSLCVC_GAFS_HALF_CL2_MAXALLOC (1 << 9) > #define GEN11_LSN_UNSLCVC_GAFS_HALF_SF_MAXALLOC (1 << 7) > > +#define GUCPMTIMESTAMP _MMIO(0xc3e8) > + > #define __GEN9_RCS0_MOCS0 0xc800 > #define GEN9_GFX_MOCS(i) _MMIO(__GEN9_RCS0_MOCS0 + (i) * 4) > #define __GEN9_VCS0_MOCS0 0xc900 > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc.c > index 2706a8c650900..ab4aacc516aa4 100644 > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c > @@ -389,6 +389,25 @@ void intel_guc_write_params(struct intel_guc *guc) > intel_uncore_forcewake_put(uncore, FORCEWAKE_GT); > } > > +void intel_guc_dump_time_info(struct intel_guc *guc, struct drm_printer *p) > +{ > + struct intel_gt *gt = guc_to_gt(guc); > + intel_wakeref_t wakeref; > + u32 stamp = 0; > + u64 ktime; > + > + intel_device_info_print_runtime(RUNTIME_INFO(gt->i915), p); Why does this function print runtime info alone? Seems kind of random, considering what intel_device_info_print_runtime() actually prints. Should it print both device info and runtime info, or nothing at all? This conflicts with [1] and I don't know what to do about it. The first instinct is to just remove it. BR, Jani. [1] https://patchwork.freedesktop.org/patch/msgid/4be86d7a0737b2c49eee460d29d42843418536fe.1660137416.git.jani.nikula@intel.com > + > + with_intel_runtime_pm(>->i915->runtime_pm, wakeref) > + stamp = intel_uncore_read(gt->uncore, GUCPMTIMESTAMP); > + ktime = ktime_get_boottime_ns(); > + > + drm_printf(p, "Kernel timestamp: 0x%08llX [%llu]\n", ktime, ktime); > + drm_printf(p, "GuC timestamp: 0x%08X [%u]\n", stamp, stamp); > + drm_printf(p, "CS timestamp frequency: %u Hz, %u ns\n", > + gt->clock_frequency, gt->clock_period_ns); > +} > + > int intel_guc_init(struct intel_guc *guc) > { > struct intel_gt *gt = guc_to_gt(guc); > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h b/drivers/gpu/drm/i915/gt/uc/intel_guc.h > index a7acffbf15d1f..804133df1ac9b 100644 > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h > @@ -464,4 +464,6 @@ void intel_guc_load_status(struct intel_guc *guc, struct drm_printer *p); > > void intel_guc_write_barrier(struct intel_guc *guc); > > +void intel_guc_dump_time_info(struct intel_guc *guc, struct drm_printer *p); > + > #endif > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c > index 991d4a02248dc..07d31ae32f765 100644 > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c > @@ -764,6 +764,8 @@ int intel_guc_log_dump(struct intel_guc_log *log, struct drm_printer *p, > if (!obj) > return 0; > > + intel_guc_dump_time_info(guc, p); > + > map = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WC); > if (IS_ERR(map)) { > DRM_DEBUG("Failed to pin object\n"); > diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c > index 32e92651ef7c2..addba75252343 100644 > --- a/drivers/gpu/drm/i915/i915_gpu_error.c > +++ b/drivers/gpu/drm/i915/i915_gpu_error.c > @@ -678,6 +678,7 @@ static void err_print_uc(struct drm_i915_error_state_buf *m, > > intel_uc_fw_dump(&error_uc->guc_fw, &p); > intel_uc_fw_dump(&error_uc->huc_fw, &p); > + err_printf(m, "GuC timestamp: 0x%08x\n", error_uc->timestamp); > intel_gpu_error_print_vma(m, NULL, error_uc->guc_log); > } > > @@ -720,6 +721,8 @@ static void err_print_gt_global_nonguc(struct drm_i915_error_state_buf *m, > int i; > > err_printf(m, "GT awake: %s\n", str_yes_no(gt->awake)); > + err_printf(m, "CS timestamp frequency: %u Hz, %d ns\n", > + gt->clock_frequency, gt->clock_period_ns); > err_printf(m, "EIR: 0x%08x\n", gt->eir); > err_printf(m, "PGTBL_ER: 0x%08x\n", gt->pgtbl_er); > > @@ -1675,6 +1678,13 @@ gt_record_uc(struct intel_gt_coredump *gt, > */ > error_uc->guc_fw.path = kstrdup(uc->guc.fw.path, ALLOW_FAIL); > error_uc->huc_fw.path = kstrdup(uc->huc.fw.path, ALLOW_FAIL); > + > + /* > + * Save the GuC log and include a timestamp reference for converting the > + * log times to system times (in conjunction with the error->boottime and > + * gt->clock_frequency fields saved elsewhere). > + */ > + error_uc->timestamp = intel_uncore_read(gt->_gt->uncore, GUCPMTIMESTAMP); > error_uc->guc_log = create_vma_coredump(gt->_gt, uc->guc.log.vma, > "GuC log buffer", compress); > > @@ -1833,6 +1843,8 @@ static void gt_record_global_regs(struct intel_gt_coredump *gt) > static void gt_record_info(struct intel_gt_coredump *gt) > { > memcpy(>->info, >->_gt->info, sizeof(struct intel_gt_info)); > + gt->clock_frequency = gt->_gt->clock_frequency; > + gt->clock_period_ns = gt->_gt->clock_period_ns; > } > > /* > diff --git a/drivers/gpu/drm/i915/i915_gpu_error.h b/drivers/gpu/drm/i915/i915_gpu_error.h > index 55a143b92d10e..d8a8b3d529e09 100644 > --- a/drivers/gpu/drm/i915/i915_gpu_error.h > +++ b/drivers/gpu/drm/i915/i915_gpu_error.h > @@ -150,6 +150,8 @@ struct intel_gt_coredump { > u32 gtt_cache; > u32 aux_err; /* gen12 */ > u32 gam_done; /* gen12 */ > + u32 clock_frequency; > + u32 clock_period_ns; > > /* Display related */ > u32 derrmr; > @@ -164,6 +166,7 @@ struct intel_gt_coredump { > struct intel_uc_fw guc_fw; > struct intel_uc_fw huc_fw; > struct i915_vma_coredump *guc_log; > + u32 timestamp; > bool is_guc_capture; > } *uc;
On 8/19/2022 03:45, Jani Nikula wrote: > On Wed, 27 Jul 2022, John.C.Harrison@Intel.com wrote: >> From: John Harrison <John.C.Harrison@Intel.com> >> >> It is useful to be able to match GuC events to kernel events when >> looking at the GuC log. That requires being able to convert GuC >> timestamps to kernel time. So, when dumping error captures and/or GuC >> logs, include a stamp in both time zones plus the clock frequency. >> >> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> >> --- >> drivers/gpu/drm/i915/gt/intel_gt_regs.h | 2 ++ >> drivers/gpu/drm/i915/gt/uc/intel_guc.c | 19 +++++++++++++++++++ >> drivers/gpu/drm/i915/gt/uc/intel_guc.h | 2 ++ >> drivers/gpu/drm/i915/gt/uc/intel_guc_log.c | 2 ++ >> drivers/gpu/drm/i915/i915_gpu_error.c | 12 ++++++++++++ >> drivers/gpu/drm/i915/i915_gpu_error.h | 3 +++ >> 6 files changed, 40 insertions(+) >> >> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h >> index 60d6eb5f245b7..fc7979bd91db5 100644 >> --- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h >> +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h >> @@ -1007,6 +1007,8 @@ >> #define GEN11_LSN_UNSLCVC_GAFS_HALF_CL2_MAXALLOC (1 << 9) >> #define GEN11_LSN_UNSLCVC_GAFS_HALF_SF_MAXALLOC (1 << 7) >> >> +#define GUCPMTIMESTAMP _MMIO(0xc3e8) >> + >> #define __GEN9_RCS0_MOCS0 0xc800 >> #define GEN9_GFX_MOCS(i) _MMIO(__GEN9_RCS0_MOCS0 + (i) * 4) >> #define __GEN9_VCS0_MOCS0 0xc900 >> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc.c >> index 2706a8c650900..ab4aacc516aa4 100644 >> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c >> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c >> @@ -389,6 +389,25 @@ void intel_guc_write_params(struct intel_guc *guc) >> intel_uncore_forcewake_put(uncore, FORCEWAKE_GT); >> } >> >> +void intel_guc_dump_time_info(struct intel_guc *guc, struct drm_printer *p) >> +{ >> + struct intel_gt *gt = guc_to_gt(guc); >> + intel_wakeref_t wakeref; >> + u32 stamp = 0; >> + u64 ktime; >> + >> + intel_device_info_print_runtime(RUNTIME_INFO(gt->i915), p); > Why does this function print runtime info alone? Seems kind of random, > considering what intel_device_info_print_runtime() actually > prints. Should it print both device info and runtime info, or nothing at > all? Hmm. That was added specifically to print the rawclk value (and only for the rawclk value) because that was the frequency that the GuC time stamp register ticks at. I think we later worked out that the CS frequency was more correct. Hence printing gt->clock_frequency lower down. I guess I forgot to go back and remove the rawclk print, though. So yeah, it can be removed. John. > > This conflicts with [1] and I don't know what to do about it. The first > instinct is to just remove it. > > BR, > Jani. > > > [1] https://patchwork.freedesktop.org/patch/msgid/4be86d7a0737b2c49eee460d29d42843418536fe.1660137416.git.jani.nikula@intel.com > >> + >> + with_intel_runtime_pm(>->i915->runtime_pm, wakeref) >> + stamp = intel_uncore_read(gt->uncore, GUCPMTIMESTAMP); >> + ktime = ktime_get_boottime_ns(); >> + >> + drm_printf(p, "Kernel timestamp: 0x%08llX [%llu]\n", ktime, ktime); >> + drm_printf(p, "GuC timestamp: 0x%08X [%u]\n", stamp, stamp); >> + drm_printf(p, "CS timestamp frequency: %u Hz, %u ns\n", >> + gt->clock_frequency, gt->clock_period_ns); >> +} >> + >> int intel_guc_init(struct intel_guc *guc) >> { >> struct intel_gt *gt = guc_to_gt(guc); >> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h b/drivers/gpu/drm/i915/gt/uc/intel_guc.h >> index a7acffbf15d1f..804133df1ac9b 100644 >> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h >> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h >> @@ -464,4 +464,6 @@ void intel_guc_load_status(struct intel_guc *guc, struct drm_printer *p); >> >> void intel_guc_write_barrier(struct intel_guc *guc); >> >> +void intel_guc_dump_time_info(struct intel_guc *guc, struct drm_printer *p); >> + >> #endif >> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c >> index 991d4a02248dc..07d31ae32f765 100644 >> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c >> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c >> @@ -764,6 +764,8 @@ int intel_guc_log_dump(struct intel_guc_log *log, struct drm_printer *p, >> if (!obj) >> return 0; >> >> + intel_guc_dump_time_info(guc, p); >> + >> map = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WC); >> if (IS_ERR(map)) { >> DRM_DEBUG("Failed to pin object\n"); >> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c >> index 32e92651ef7c2..addba75252343 100644 >> --- a/drivers/gpu/drm/i915/i915_gpu_error.c >> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c >> @@ -678,6 +678,7 @@ static void err_print_uc(struct drm_i915_error_state_buf *m, >> >> intel_uc_fw_dump(&error_uc->guc_fw, &p); >> intel_uc_fw_dump(&error_uc->huc_fw, &p); >> + err_printf(m, "GuC timestamp: 0x%08x\n", error_uc->timestamp); >> intel_gpu_error_print_vma(m, NULL, error_uc->guc_log); >> } >> >> @@ -720,6 +721,8 @@ static void err_print_gt_global_nonguc(struct drm_i915_error_state_buf *m, >> int i; >> >> err_printf(m, "GT awake: %s\n", str_yes_no(gt->awake)); >> + err_printf(m, "CS timestamp frequency: %u Hz, %d ns\n", >> + gt->clock_frequency, gt->clock_period_ns); >> err_printf(m, "EIR: 0x%08x\n", gt->eir); >> err_printf(m, "PGTBL_ER: 0x%08x\n", gt->pgtbl_er); >> >> @@ -1675,6 +1678,13 @@ gt_record_uc(struct intel_gt_coredump *gt, >> */ >> error_uc->guc_fw.path = kstrdup(uc->guc.fw.path, ALLOW_FAIL); >> error_uc->huc_fw.path = kstrdup(uc->huc.fw.path, ALLOW_FAIL); >> + >> + /* >> + * Save the GuC log and include a timestamp reference for converting the >> + * log times to system times (in conjunction with the error->boottime and >> + * gt->clock_frequency fields saved elsewhere). >> + */ >> + error_uc->timestamp = intel_uncore_read(gt->_gt->uncore, GUCPMTIMESTAMP); >> error_uc->guc_log = create_vma_coredump(gt->_gt, uc->guc.log.vma, >> "GuC log buffer", compress); >> >> @@ -1833,6 +1843,8 @@ static void gt_record_global_regs(struct intel_gt_coredump *gt) >> static void gt_record_info(struct intel_gt_coredump *gt) >> { >> memcpy(>->info, >->_gt->info, sizeof(struct intel_gt_info)); >> + gt->clock_frequency = gt->_gt->clock_frequency; >> + gt->clock_period_ns = gt->_gt->clock_period_ns; >> } >> >> /* >> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.h b/drivers/gpu/drm/i915/i915_gpu_error.h >> index 55a143b92d10e..d8a8b3d529e09 100644 >> --- a/drivers/gpu/drm/i915/i915_gpu_error.h >> +++ b/drivers/gpu/drm/i915/i915_gpu_error.h >> @@ -150,6 +150,8 @@ struct intel_gt_coredump { >> u32 gtt_cache; >> u32 aux_err; /* gen12 */ >> u32 gam_done; /* gen12 */ >> + u32 clock_frequency; >> + u32 clock_period_ns; >> >> /* Display related */ >> u32 derrmr; >> @@ -164,6 +166,7 @@ struct intel_gt_coredump { >> struct intel_uc_fw guc_fw; >> struct intel_uc_fw huc_fw; >> struct i915_vma_coredump *guc_log; >> + u32 timestamp; >> bool is_guc_capture; >> } *uc;
On Fri, 19 Aug 2022, John Harrison <john.c.harrison@intel.com> wrote: > On 8/19/2022 03:45, Jani Nikula wrote: >> On Wed, 27 Jul 2022, John.C.Harrison@Intel.com wrote: >>> From: John Harrison <John.C.Harrison@Intel.com> >>> >>> It is useful to be able to match GuC events to kernel events when >>> looking at the GuC log. That requires being able to convert GuC >>> timestamps to kernel time. So, when dumping error captures and/or GuC >>> logs, include a stamp in both time zones plus the clock frequency. >>> >>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> >>> --- >>> drivers/gpu/drm/i915/gt/intel_gt_regs.h | 2 ++ >>> drivers/gpu/drm/i915/gt/uc/intel_guc.c | 19 +++++++++++++++++++ >>> drivers/gpu/drm/i915/gt/uc/intel_guc.h | 2 ++ >>> drivers/gpu/drm/i915/gt/uc/intel_guc_log.c | 2 ++ >>> drivers/gpu/drm/i915/i915_gpu_error.c | 12 ++++++++++++ >>> drivers/gpu/drm/i915/i915_gpu_error.h | 3 +++ >>> 6 files changed, 40 insertions(+) >>> >>> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h >>> index 60d6eb5f245b7..fc7979bd91db5 100644 >>> --- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h >>> +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h >>> @@ -1007,6 +1007,8 @@ >>> #define GEN11_LSN_UNSLCVC_GAFS_HALF_CL2_MAXALLOC (1 << 9) >>> #define GEN11_LSN_UNSLCVC_GAFS_HALF_SF_MAXALLOC (1 << 7) >>> >>> +#define GUCPMTIMESTAMP _MMIO(0xc3e8) >>> + >>> #define __GEN9_RCS0_MOCS0 0xc800 >>> #define GEN9_GFX_MOCS(i) _MMIO(__GEN9_RCS0_MOCS0 + (i) * 4) >>> #define __GEN9_VCS0_MOCS0 0xc900 >>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc.c >>> index 2706a8c650900..ab4aacc516aa4 100644 >>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c >>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c >>> @@ -389,6 +389,25 @@ void intel_guc_write_params(struct intel_guc *guc) >>> intel_uncore_forcewake_put(uncore, FORCEWAKE_GT); >>> } >>> >>> +void intel_guc_dump_time_info(struct intel_guc *guc, struct drm_printer *p) >>> +{ >>> + struct intel_gt *gt = guc_to_gt(guc); >>> + intel_wakeref_t wakeref; >>> + u32 stamp = 0; >>> + u64 ktime; >>> + >>> + intel_device_info_print_runtime(RUNTIME_INFO(gt->i915), p); >> Why does this function print runtime info alone? Seems kind of random, >> considering what intel_device_info_print_runtime() actually >> prints. Should it print both device info and runtime info, or nothing at >> all? > Hmm. That was added specifically to print the rawclk value (and only for > the rawclk value) because that was the frequency that the GuC time stamp > register ticks at. I think we later worked out that the CS frequency was > more correct. Hence printing gt->clock_frequency lower down. I guess I > forgot to go back and remove the rawclk print, though. > > So yeah, it can be removed. Could you r-b the patch [1] doing just that please? BR, Jani. [1] https://patchwork.freedesktop.org/patch/msgid/b395ac4c909042f5daabf29959d8733993545aa2.1660910433.git.jani.nikula@intel.com
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h index 60d6eb5f245b7..fc7979bd91db5 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h @@ -1007,6 +1007,8 @@ #define GEN11_LSN_UNSLCVC_GAFS_HALF_CL2_MAXALLOC (1 << 9) #define GEN11_LSN_UNSLCVC_GAFS_HALF_SF_MAXALLOC (1 << 7) +#define GUCPMTIMESTAMP _MMIO(0xc3e8) + #define __GEN9_RCS0_MOCS0 0xc800 #define GEN9_GFX_MOCS(i) _MMIO(__GEN9_RCS0_MOCS0 + (i) * 4) #define __GEN9_VCS0_MOCS0 0xc900 diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc.c index 2706a8c650900..ab4aacc516aa4 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c @@ -389,6 +389,25 @@ void intel_guc_write_params(struct intel_guc *guc) intel_uncore_forcewake_put(uncore, FORCEWAKE_GT); } +void intel_guc_dump_time_info(struct intel_guc *guc, struct drm_printer *p) +{ + struct intel_gt *gt = guc_to_gt(guc); + intel_wakeref_t wakeref; + u32 stamp = 0; + u64 ktime; + + intel_device_info_print_runtime(RUNTIME_INFO(gt->i915), p); + + with_intel_runtime_pm(>->i915->runtime_pm, wakeref) + stamp = intel_uncore_read(gt->uncore, GUCPMTIMESTAMP); + ktime = ktime_get_boottime_ns(); + + drm_printf(p, "Kernel timestamp: 0x%08llX [%llu]\n", ktime, ktime); + drm_printf(p, "GuC timestamp: 0x%08X [%u]\n", stamp, stamp); + drm_printf(p, "CS timestamp frequency: %u Hz, %u ns\n", + gt->clock_frequency, gt->clock_period_ns); +} + int intel_guc_init(struct intel_guc *guc) { struct intel_gt *gt = guc_to_gt(guc); diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h b/drivers/gpu/drm/i915/gt/uc/intel_guc.h index a7acffbf15d1f..804133df1ac9b 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h @@ -464,4 +464,6 @@ void intel_guc_load_status(struct intel_guc *guc, struct drm_printer *p); void intel_guc_write_barrier(struct intel_guc *guc); +void intel_guc_dump_time_info(struct intel_guc *guc, struct drm_printer *p); + #endif diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c index 991d4a02248dc..07d31ae32f765 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c @@ -764,6 +764,8 @@ int intel_guc_log_dump(struct intel_guc_log *log, struct drm_printer *p, if (!obj) return 0; + intel_guc_dump_time_info(guc, p); + map = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WC); if (IS_ERR(map)) { DRM_DEBUG("Failed to pin object\n"); diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index 32e92651ef7c2..addba75252343 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -678,6 +678,7 @@ static void err_print_uc(struct drm_i915_error_state_buf *m, intel_uc_fw_dump(&error_uc->guc_fw, &p); intel_uc_fw_dump(&error_uc->huc_fw, &p); + err_printf(m, "GuC timestamp: 0x%08x\n", error_uc->timestamp); intel_gpu_error_print_vma(m, NULL, error_uc->guc_log); } @@ -720,6 +721,8 @@ static void err_print_gt_global_nonguc(struct drm_i915_error_state_buf *m, int i; err_printf(m, "GT awake: %s\n", str_yes_no(gt->awake)); + err_printf(m, "CS timestamp frequency: %u Hz, %d ns\n", + gt->clock_frequency, gt->clock_period_ns); err_printf(m, "EIR: 0x%08x\n", gt->eir); err_printf(m, "PGTBL_ER: 0x%08x\n", gt->pgtbl_er); @@ -1675,6 +1678,13 @@ gt_record_uc(struct intel_gt_coredump *gt, */ error_uc->guc_fw.path = kstrdup(uc->guc.fw.path, ALLOW_FAIL); error_uc->huc_fw.path = kstrdup(uc->huc.fw.path, ALLOW_FAIL); + + /* + * Save the GuC log and include a timestamp reference for converting the + * log times to system times (in conjunction with the error->boottime and + * gt->clock_frequency fields saved elsewhere). + */ + error_uc->timestamp = intel_uncore_read(gt->_gt->uncore, GUCPMTIMESTAMP); error_uc->guc_log = create_vma_coredump(gt->_gt, uc->guc.log.vma, "GuC log buffer", compress); @@ -1833,6 +1843,8 @@ static void gt_record_global_regs(struct intel_gt_coredump *gt) static void gt_record_info(struct intel_gt_coredump *gt) { memcpy(>->info, >->_gt->info, sizeof(struct intel_gt_info)); + gt->clock_frequency = gt->_gt->clock_frequency; + gt->clock_period_ns = gt->_gt->clock_period_ns; } /* diff --git a/drivers/gpu/drm/i915/i915_gpu_error.h b/drivers/gpu/drm/i915/i915_gpu_error.h index 55a143b92d10e..d8a8b3d529e09 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.h +++ b/drivers/gpu/drm/i915/i915_gpu_error.h @@ -150,6 +150,8 @@ struct intel_gt_coredump { u32 gtt_cache; u32 aux_err; /* gen12 */ u32 gam_done; /* gen12 */ + u32 clock_frequency; + u32 clock_period_ns; /* Display related */ u32 derrmr; @@ -164,6 +166,7 @@ struct intel_gt_coredump { struct intel_uc_fw guc_fw; struct intel_uc_fw huc_fw; struct i915_vma_coredump *guc_log; + u32 timestamp; bool is_guc_capture; } *uc;