Message ID | 20220226095541.1010534-10-alan.previn.teres.alexis@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Add GuC Error Capture Support | expand |
Hi Alan, Thank you for the patch! Yet something to improve: [auto build test ERROR on drm-tip/drm-tip] [cannot apply to drm-intel/for-linux-next drm-exynos/exynos-drm-next drm/drm-next tegra-drm/drm/tegra/for-next airlied/drm-next v5.17-rc6 next-20220228] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/0day-ci/linux/commits/Alan-Previn/Add-GuC-Error-Capture-Support/20220226-175600 base: git://anongit.freedesktop.org/drm/drm-tip drm-tip config: i386-allyesconfig (https://download.01.org/0day-ci/archive/20220301/202203010622.2JyDEoHX-lkp@intel.com/config) compiler: clang version 15.0.0 (https://github.com/llvm/llvm-project d271fc04d5b97b12e6b797c6067d3c96a8d7470e) reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/0day-ci/linux/commit/4c41838b35d9a5c0bcb4380e0064cb2d5d33661f git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Alan-Previn/Add-GuC-Error-Capture-Support/20220226-175600 git checkout 4c41838b35d9a5c0bcb4380e0064cb2d5d33661f # save the config file to linux build tree mkdir build_dir COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=i386 SHELL=/bin/bash If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> All errors (new ones prefixed by >>): drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c:555:1: error: no previous prototype for function 'intel_guc_capture_getlistsize' [-Werror,-Wmissing-prototypes] intel_guc_capture_getlistsize(struct intel_guc *guc, u32 owner, u32 type, u32 classid, ^ drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c:554:1: note: declare 'static' if the function is not intended to be used outside of this translation unit int ^ static drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c:585:1: error: no previous prototype for function 'intel_guc_capture_getlist' [-Werror,-Wmissing-prototypes] intel_guc_capture_getlist(struct intel_guc *guc, u32 owner, u32 type, u32 classid, ^ drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c:584:1: note: declare 'static' if the function is not intended to be used outside of this translation unit int ^ static >> drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c:648:5: error: no previous prototype for function 'intel_guc_capture_output_min_size_est' [-Werror,-Wmissing-prototypes] int intel_guc_capture_output_min_size_est(struct intel_guc *guc) ^ drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c:648:1: note: declare 'static' if the function is not intended to be used outside of this translation unit int intel_guc_capture_output_min_size_est(struct intel_guc *guc) ^ static drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c:711:6: error: no previous prototype for function 'intel_guc_capture_destroy' [-Werror,-Wmissing-prototypes] void intel_guc_capture_destroy(struct intel_guc *guc) ^ drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c:711:1: note: declare 'static' if the function is not intended to be used outside of this translation unit void intel_guc_capture_destroy(struct intel_guc *guc) ^ static drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c:727:5: error: no previous prototype for function 'intel_guc_capture_init' [-Werror,-Wmissing-prototypes] int intel_guc_capture_init(struct intel_guc *guc) ^ drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c:727:1: note: declare 'static' if the function is not intended to be used outside of this translation unit int intel_guc_capture_init(struct intel_guc *guc) ^ static 5 errors generated. vim +/intel_guc_capture_output_min_size_est +648 drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c 646 647 #define GUC_CAPTURE_OVERBUFFER_MULTIPLIER 3 > 648 int intel_guc_capture_output_min_size_est(struct intel_guc *guc) 649 { 650 struct intel_gt *gt = guc_to_gt(guc); 651 struct intel_engine_cs *engine; 652 enum intel_engine_id id; 653 int worst_min_size = 0, num_regs = 0; 654 size_t tmp = 0; 655 656 /* 657 * If every single engine-instance suffered a failure in quick succession but 658 * were all unrelated, then a burst of multiple error-capture events would dump 659 * registers for every one engine instance, one at a time. In this case, GuC 660 * would even dump the global-registers repeatedly. 661 * 662 * For each engine instance, there would be 1 x guc_state_capture_group_t output 663 * followed by 3 x guc_state_capture_t lists. The latter is how the register 664 * dumps are split across different register types (where the '3' are global vs class 665 * vs instance). Finally, let's multiply the whole thing by 3x (just so we are 666 * not limited to just 1 round of data in a worst case full register dump log) 667 * 668 * NOTE: intel_guc_log that allocates the log buffer would round this size up to 669 * a power of two. 670 */ 671 672 for_each_engine(engine, gt, id) { 673 worst_min_size += sizeof(struct guc_state_capture_group_header_t) + 674 (3 * sizeof(struct guc_state_capture_header_t)); 675 676 if (!intel_guc_capture_getlistsize(guc, 0, GUC_CAPTURE_LIST_TYPE_GLOBAL, 0, &tmp)) 677 num_regs += tmp; 678 679 if (!intel_guc_capture_getlistsize(guc, 0, GUC_CAPTURE_LIST_TYPE_ENGINE_CLASS, 680 engine->class, &tmp)) { 681 num_regs += tmp; 682 } 683 if (!intel_guc_capture_getlistsize(guc, 0, GUC_CAPTURE_LIST_TYPE_ENGINE_INSTANCE, 684 engine->class, &tmp)) { 685 num_regs += tmp; 686 } 687 } 688 689 worst_min_size += (num_regs * sizeof(struct guc_mmio_reg)); 690 691 return (worst_min_size * GUC_CAPTURE_OVERBUFFER_MULTIPLIER); 692 } 693 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c index eb22f979d720..ed78995bcc35 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c @@ -644,6 +644,53 @@ intel_guc_capture_getlist(struct intel_guc *guc, u32 owner, u32 type, u32 classi return 0; } +#define GUC_CAPTURE_OVERBUFFER_MULTIPLIER 3 +int intel_guc_capture_output_min_size_est(struct intel_guc *guc) +{ + struct intel_gt *gt = guc_to_gt(guc); + struct intel_engine_cs *engine; + enum intel_engine_id id; + int worst_min_size = 0, num_regs = 0; + size_t tmp = 0; + + /* + * If every single engine-instance suffered a failure in quick succession but + * were all unrelated, then a burst of multiple error-capture events would dump + * registers for every one engine instance, one at a time. In this case, GuC + * would even dump the global-registers repeatedly. + * + * For each engine instance, there would be 1 x guc_state_capture_group_t output + * followed by 3 x guc_state_capture_t lists. The latter is how the register + * dumps are split across different register types (where the '3' are global vs class + * vs instance). Finally, let's multiply the whole thing by 3x (just so we are + * not limited to just 1 round of data in a worst case full register dump log) + * + * NOTE: intel_guc_log that allocates the log buffer would round this size up to + * a power of two. + */ + + for_each_engine(engine, gt, id) { + worst_min_size += sizeof(struct guc_state_capture_group_header_t) + + (3 * sizeof(struct guc_state_capture_header_t)); + + if (!intel_guc_capture_getlistsize(guc, 0, GUC_CAPTURE_LIST_TYPE_GLOBAL, 0, &tmp)) + num_regs += tmp; + + if (!intel_guc_capture_getlistsize(guc, 0, GUC_CAPTURE_LIST_TYPE_ENGINE_CLASS, + engine->class, &tmp)) { + num_regs += tmp; + } + if (!intel_guc_capture_getlistsize(guc, 0, GUC_CAPTURE_LIST_TYPE_ENGINE_INSTANCE, + engine->class, &tmp)) { + num_regs += tmp; + } + } + + worst_min_size += (num_regs * sizeof(struct guc_mmio_reg)); + + return (worst_min_size * GUC_CAPTURE_OVERBUFFER_MULTIPLIER); +} + static void guc_capture_free_ads_cache(struct __guc_state_capture_priv *gc) { diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.h index f05365239a2f..24a11f33f7d9 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.h +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.h @@ -12,6 +12,7 @@ struct file; struct guc_gt_system_info; struct intel_guc; +int intel_guc_capture_output_min_size_est(struct intel_guc *guc); int intel_guc_capture_getlist(struct intel_guc *guc, u32 owner, u32 type, u32 classid, struct file **fileptr); int intel_guc_capture_getlistsize(struct intel_guc *guc, u32 owner, u32 type, u32 classid, diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c index 2cc52f1eedf3..e9a865c2f4cb 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c @@ -6,10 +6,11 @@ #include <linux/debugfs.h> #include "gt/intel_gt.h" +#include "intel_guc_capture.h" +#include "intel_guc_log.h" #include "i915_drv.h" #include "i915_irq.h" #include "i915_memcpy.h" -#include "intel_guc_log.h" static void guc_log_copy_debuglogs_for_relay(struct intel_guc_log *log); @@ -464,6 +465,10 @@ int intel_guc_log_create(struct intel_guc_log *log) * | Capture logs | * +===============================+ + CAPTURE_SIZE */ + if (intel_guc_capture_output_min_size_est(guc) > CAPTURE_BUFFER_SIZE) + DRM_WARN("GuC log buffer for state_capture maybe too small. %d < %d\n", + CAPTURE_BUFFER_SIZE, intel_guc_capture_output_min_size_est(guc)); + guc_log_size = PAGE_SIZE + CRASH_BUFFER_SIZE + DEBUG_BUFFER_SIZE + CAPTURE_BUFFER_SIZE;
Add intel_guc_capture_output_min_size_est function to provide a reasonable minimum size for error-capture region before allocating the shared buffer. Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com> --- .../gpu/drm/i915/gt/uc/intel_guc_capture.c | 47 +++++++++++++++++++ .../gpu/drm/i915/gt/uc/intel_guc_capture.h | 1 + drivers/gpu/drm/i915/gt/uc/intel_guc_log.c | 7 ++- 3 files changed, 54 insertions(+), 1 deletion(-)