diff mbox series

[5/7] drm/i915/guc: Add extra debug on CT deadlock

Message ID 20211211005612.8575-6-matthew.brost@intel.com (mailing list archive)
State New, archived
Series Fix stealing guc_ids + test

Commit Message

Matthew Brost Dec. 11, 2021, 12:56 a.m. UTC
Print CT state (H2G + G2H head / tail pointers, credits) on CT
deadlock.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 9 +++++++++
 1 file changed, 9 insertions(+)

Comments

John Harrison Dec. 11, 2021, 1:43 a.m. UTC | #1
On 12/10/2021 16:56, Matthew Brost wrote:
> Print CT state (H2G + G2H head / tail pointers, credits) on CT
> deadlock.
>
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>

> ---
>   drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 9 +++++++++
>   1 file changed, 9 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> index a0cc34be7b56..ee5525c6f79b 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> @@ -523,6 +523,15 @@ static inline bool ct_deadlocked(struct intel_guc_ct *ct)
>   		CT_ERROR(ct, "Communication stalled for %lld ms, desc status=%#x,%#x\n",
>   			 ktime_ms_delta(ktime_get(), ct->stall_time),
>   			 send->status, recv->status);
> +		CT_ERROR(ct, "H2G Space: %u\n",
> +			 atomic_read(&ct->ctbs.send.space) * 4);
> +		CT_ERROR(ct, "Head: %u\n", ct->ctbs.send.desc->head);
> +		CT_ERROR(ct, "Tail: %u\n", ct->ctbs.send.desc->tail);
> +		CT_ERROR(ct, "G2H Space: %u\n",
> +			 atomic_read(&ct->ctbs.recv.space) * 4);
> +		CT_ERROR(ct, "Head: %u\n", ct->ctbs.recv.desc->head);
> +		CT_ERROR(ct, "Tail: %u\n", ct->ctbs.recv.desc->tail);
> +
>   		ct->ctbs.send.broken = true;
>   	}
>
John Harrison Dec. 11, 2021, 1:45 a.m. UTC | #2
On 12/10/2021 17:43, John Harrison wrote:
> On 12/10/2021 16:56, Matthew Brost wrote:
>> Print CT state (H2G + G2H head / tail pointers, credits) on CT
>> deadlock.
>>
>> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
>
>> ---
>>   drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 9 +++++++++
>>   1 file changed, 9 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
>> index a0cc34be7b56..ee5525c6f79b 100644
>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
>> @@ -523,6 +523,15 @@ static inline bool ct_deadlocked(struct intel_guc_ct *ct)
>>           CT_ERROR(ct, "Communication stalled for %lld ms, desc status=%#x,%#x\n",
>>                ktime_ms_delta(ktime_get(), ct->stall_time),
>>                send->status, recv->status);
>> +        CT_ERROR(ct, "H2G Space: %u\n",
>> +             atomic_read(&ct->ctbs.send.space) * 4);
>> +        CT_ERROR(ct, "Head: %u\n", ct->ctbs.send.desc->head);
>> +        CT_ERROR(ct, "Tail: %u\n", ct->ctbs.send.desc->tail);
Actually, aren't these offsets in dwords? So, scaling the dword space to 
bytes but leaving these as dwords would produce confusing numbers.

John.

>> +        CT_ERROR(ct, "G2H Space: %u\n",
>> +             atomic_read(&ct->ctbs.recv.space) * 4);
>> +        CT_ERROR(ct, "Head: %u\n", ct->ctbs.recv.desc->head);
>> +        CT_ERROR(ct, "Tail: %u\n", ct->ctbs.recv.desc->tail);
>> +
>>           ct->ctbs.send.broken = true;
>>       }
>
Matthew Brost Dec. 11, 2021, 3:24 a.m. UTC | #3
On Fri, Dec 10, 2021 at 05:45:05PM -0800, John Harrison wrote:
> On 12/10/2021 17:43, John Harrison wrote:
> > On 12/10/2021 16:56, Matthew Brost wrote:
> > > Print CT state (H2G + G2H head / tail pointers, credits) on CT
> > > deadlock.
> > > 
> > > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
> > 
> > > ---
> > >   drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 9 +++++++++
> > >   1 file changed, 9 insertions(+)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> > > index a0cc34be7b56..ee5525c6f79b 100644
> > > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> > > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> > > @@ -523,6 +523,15 @@ static inline bool ct_deadlocked(struct intel_guc_ct *ct)
> > >           CT_ERROR(ct, "Communication stalled for %lld ms, desc status=%#x,%#x\n",
> > >                ktime_ms_delta(ktime_get(), ct->stall_time),
> > >                send->status, recv->status);
> > > +        CT_ERROR(ct, "H2G Space: %u\n",
> > > +             atomic_read(&ct->ctbs.send.space) * 4);
> > > +        CT_ERROR(ct, "Head: %u\n", ct->ctbs.send.desc->head);
> > > +        CT_ERROR(ct, "Tail: %u\n", ct->ctbs.send.desc->tail);
> Actually, aren't these offsets in dwords? So, scaling the dword space to
> bytes but leaving these as dwords would produce confusing numbers.
> 

Copied and pasted from the CT debugfs code, but yes, it is slightly
confusing. I'd rather leave the head / tail in their native format
(dwords), but I'll add the units to both the space and the pointer prints.

Matt

> John.
> 
> > > +        CT_ERROR(ct, "G2H Space: %u\n",
> > > +             atomic_read(&ct->ctbs.recv.space) * 4);
> > > +        CT_ERROR(ct, "Head: %u\n", ct->ctbs.recv.desc->head);
> > > +        CT_ERROR(ct, "Tail: %u\n", ct->ctbs.recv.desc->tail);
> > > +
> > >           ct->ctbs.send.broken = true;
> > >       }
> > 
>
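
For readers following the units discussion above: the space counter is
tracked in dwords and scaled by 4 to bytes for printing, while head / tail
are printed as raw dword offsets. A minimal user-space sketch of prints that
label their units might look like the block below. It is not the follow-up
revision of this patch; struct ctb_state, dw_to_bytes() and
format_ctb_state() are hypothetical names invented for illustration, and
snprintf() stands in for CT_ERROR().

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for the state of one CT buffer (H2G or G2H).
 * This is NOT the i915 structure; it only mirrors the three values
 * the patch prints.
 */
struct ctb_state {
	unsigned int space_dw;	/* free space, in dwords */
	unsigned int head_dw;	/* descriptor head offset, in dwords */
	unsigned int tail_dw;	/* descriptor tail offset, in dwords */
};

/* One dword is 4 bytes, matching the "* 4" scaling in the patch. */
static unsigned int dw_to_bytes(unsigned int dw)
{
	return dw * 4u;
}

/*
 * Format one buffer's state with explicit units: space is converted
 * to bytes, head / tail stay in their native dword format.
 * Returns the snprintf() result.
 */
static int format_ctb_state(char *buf, size_t len, const char *name,
			    const struct ctb_state *s)
{
	return snprintf(buf, len,
			"%s Space: %u (Bytes), Head: %u (Dwords), Tail: %u (Dwords)",
			name, dw_to_bytes(s->space_dw),
			s->head_dw, s->tail_dw);
}
```

With space_dw = 16, head_dw = 8 and tail_dw = 24, calling
format_ctb_state(buf, sizeof(buf), "H2G", &s) fills buf with
"H2G Space: 64 (Bytes), Head: 8 (Dwords), Tail: 24 (Dwords)", so a reader
of the log no longer has to guess which values were scaled.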

Patch

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index a0cc34be7b56..ee5525c6f79b 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -523,6 +523,15 @@  static inline bool ct_deadlocked(struct intel_guc_ct *ct)
 		CT_ERROR(ct, "Communication stalled for %lld ms, desc status=%#x,%#x\n",
 			 ktime_ms_delta(ktime_get(), ct->stall_time),
 			 send->status, recv->status);
+		CT_ERROR(ct, "H2G Space: %u\n",
+			 atomic_read(&ct->ctbs.send.space) * 4);
+		CT_ERROR(ct, "Head: %u\n", ct->ctbs.send.desc->head);
+		CT_ERROR(ct, "Tail: %u\n", ct->ctbs.send.desc->tail);
+		CT_ERROR(ct, "G2H Space: %u\n",
+			 atomic_read(&ct->ctbs.recv.space) * 4);
+		CT_ERROR(ct, "Head: %u\n", ct->ctbs.recv.desc->head);
+		CT_ERROR(ct, "Tail: %u\n", ct->ctbs.recv.desc->tail);
+
 		ct->ctbs.send.broken = true;
 	}