[RESEND,v2] drm/i915/gt: Log reason for setting TAINT_WARN at reset

Message ID 20241220131714.1309483-1-andi.shyti@linux.intel.com (mailing list archive)
State New

Commit Message

Andi Shyti Dec. 20, 2024, 1:17 p.m. UTC
From: Sebastian Brzezinka <sebastian.brzezinka@intel.com>

TAINT_WARN is used to notify CI about non-recoverable failures that
require the device to be restarted. In some cases, there is not enough
information about the reason for the restart: the test runner is simply
killed and the DUT is rebooted, with only 'probe with driver i915 failed
with error -4' logged to dmesg.

Printing an error to dmesg before setting TAINT_WARN explains why the
device was restarted and what caused the malfunction in the first place.

Signed-off-by: Sebastian Brzezinka <sebastian.brzezinka@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
---
Cc: Andi Shyti <andi.shyti@kernel.org>

Hi,

this patch for some reason did not reach the mailing list and it
missed all the CI premerge tests. I am resending it, this time
with the Changelog and the versioning.

I am leaving it for a few days in order to be reviewed by others,
as well.

Andi

Changelog:
==========
v1 -> v2:
 - Reword the commit log

 drivers/gpu/drm/i915/gt/intel_reset.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)
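
The commit message describes TAINT_WARN as the signal CI relies on to
decide that a device needs a restart. As a rough illustration of the
consumer side, here is a minimal userspace sketch that checks whether
the TAINT_WARN bit is set in a taint mask such as the decimal value
exposed in /proc/sys/kernel/tainted. The helper names taint_warn_set()
and parse_taint_mask() are hypothetical and are not part of i915 or of
any CI tooling; only the bit position (9, printed as 'W') comes from
the kernel's taint flag definitions.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Bit index of TAINT_WARN in the kernel taint mask ('W' flag). */
#define TAINT_WARN_BIT 9

/* Hypothetical helper: true if the TAINT_WARN bit is set in mask. */
static bool taint_warn_set(unsigned long mask)
{
	return (mask >> TAINT_WARN_BIT) & 1UL;
}

/*
 * Hypothetical helper: parse the decimal text read from
 * /proc/sys/kernel/tainted. Returns false if no number was found
 * or the value is out of range.
 */
static bool parse_taint_mask(const char *text, unsigned long *mask)
{
	char *end;
	unsigned long value;

	errno = 0;
	value = strtoul(text, &end, 10);
	if (errno || end == text)
		return false;

	*mask = value;
	return true;
}
```

A caller would read the contents of /proc/sys/kernel/tainted into a
buffer, pass it to parse_taint_mask(), and then test the result with
taint_warn_set(). The mask alone says that a warning taint happened,
not why, which is the gap the added gt_err() messages are meant to fill.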

Comments

Rodrigo Vivi Dec. 20, 2024, 1:34 p.m. UTC | #1
On Fri, Dec 20, 2024 at 02:17:14PM +0100, Andi Shyti wrote:
> From: Sebastian Brzezinka <sebastian.brzezinka@intel.com>
> 
> TAINT_WARN is used to notify CI about non-recoverable failures that
> require the device to be restarted. In some cases, there is not enough
> information about the reason for the restart: the test runner is simply
> killed and the DUT is rebooted, with only 'probe with driver i915 failed
> with error -4' logged to dmesg.
> 
> Printing an error to dmesg before setting TAINT_WARN explains why the
> device was restarted and what caused the malfunction in the first place.
> 
> Signed-off-by: Sebastian Brzezinka <sebastian.brzezinka@intel.com>
> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
> ---
> Cc: Andi Shyti <andi.shyti@kernel.org>
> 
> Hi,
> 
> this patch for some reason did not reach the mailing list and it
> missed all the CI premerge tests. I am resending it, this time
> with the Changelog and the versioning.


Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

> 
> I am leaving it for a few days in order to be reviewed by others,
> as well.
> 
> Andi
> 
> Changelog:
> ==========
> v1 -> v2:
>  - Reword the commit log
> 
>  drivers/gpu/drm/i915/gt/intel_reset.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
> index c2fe3fc78e76..aae5a081cb53 100644
> --- a/drivers/gpu/drm/i915/gt/intel_reset.c
> +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
> @@ -1113,6 +1113,7 @@ static bool __intel_gt_unset_wedged(struct intel_gt *gt)
>  		 * Warn CI about the unrecoverable wedged condition.
>  		 * Time for a reboot.
>  		 */
> +		gt_err(gt, "Unrecoverable wedged condition\n");
>  		add_taint_for_CI(gt->i915, TAINT_WARN);
>  		return false;
>  	}
> @@ -1264,8 +1265,10 @@ void intel_gt_reset(struct intel_gt *gt,
>  	}
>  
>  	ret = resume(gt);
> -	if (ret)
> +	if (ret) {
> +		gt_err(gt, "Failed to resume (%d)\n", ret);
>  		goto taint;
> +	}
>  
>  finish:
>  	reset_finish(gt, awake);
> @@ -1608,6 +1611,7 @@ void intel_gt_set_wedged_on_init(struct intel_gt *gt)
>  	set_bit(I915_WEDGED_ON_INIT, &gt->reset.flags);
>  
>  	/* Wedged on init is non-recoverable */
> +	gt_err(gt, "Non-recoverable wedged on init\n");
>  	add_taint_for_CI(gt->i915, TAINT_WARN);
>  }
>  
> -- 
> 2.45.2
>

Patch

diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
index c2fe3fc78e76..aae5a081cb53 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -1113,6 +1113,7 @@  static bool __intel_gt_unset_wedged(struct intel_gt *gt)
 		 * Warn CI about the unrecoverable wedged condition.
 		 * Time for a reboot.
 		 */
+		gt_err(gt, "Unrecoverable wedged condition\n");
 		add_taint_for_CI(gt->i915, TAINT_WARN);
 		return false;
 	}
@@ -1264,8 +1265,10 @@  void intel_gt_reset(struct intel_gt *gt,
 	}
 
 	ret = resume(gt);
-	if (ret)
+	if (ret) {
+		gt_err(gt, "Failed to resume (%d)\n", ret);
 		goto taint;
+	}
 
 finish:
 	reset_finish(gt, awake);
@@ -1608,6 +1611,7 @@  void intel_gt_set_wedged_on_init(struct intel_gt *gt)
 	set_bit(I915_WEDGED_ON_INIT, &gt->reset.flags);
 
 	/* Wedged on init is non-recoverable */
+	gt_err(gt, "Non-recoverable wedged on init\n");
 	add_taint_for_CI(gt->i915, TAINT_WARN);
 }
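
All three hunks above follow the same pattern: record a human-readable
reason first, then set the taint. The following is a minimal userspace
sketch of that ordering, modelled loosely on the resume-failure hunk.
It is not the i915 implementation: sketch_err(), sketch_add_taint(),
sketch_reset() and the two static variables are hypothetical stand-ins
for gt_err(), add_taint_for_CI() and kernel state, and the real
intel_gt_reset() returns void and reaches the taint through a goto
label instead of returning a boolean.

```c
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>

/* Bit index of TAINT_WARN in the kernel taint mask. */
#define SKETCH_TAINT_WARN 9

/* Hypothetical stand-ins for the kernel taint mask and the last dmesg line. */
static unsigned long sketch_taint_mask;
static char sketch_last_error[128];

/* Stand-in for gt_err(): format the message, keep a copy, print it. */
static void sketch_err(const char *fmt, ...)
{
	va_list args;

	va_start(args, fmt);
	vsnprintf(sketch_last_error, sizeof(sketch_last_error), fmt, args);
	va_end(args);
	fputs(sketch_last_error, stderr);
}

/* Stand-in for add_taint_for_CI(): set one bit in the mask. */
static void sketch_add_taint(unsigned int flag)
{
	sketch_taint_mask |= 1UL << flag;
}

/*
 * Simplified shape of the resume-failure path: log the reason,
 * then taint. Returns true when no taint was needed.
 */
static bool sketch_reset(int resume_ret)
{
	if (resume_ret) {
		sketch_err("Failed to resume (%d)\n", resume_ret);
		sketch_add_taint(SKETCH_TAINT_WARN);
		return false;
	}

	return true;
}
```

Logging before tainting matters because the taint is what prompts CI to
stop the run and reboot the machine; a message emitted first gives the
log a reason to go with the taint.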