diff mbox series

drm/i915/gt: Reduce log severity on reset prepare.

Message ID 20231201154443.16660-1-nirmoy.das@intel.com (mailing list archive)
State New, archived
Headers show
Series drm/i915/gt: Reduce log severity on reset prepare. | expand

Commit Message

Nirmoy Das Dec. 1, 2023, 3:44 p.m. UTC
gen8_engine_reset_prepare() can fail when HW fails to set
RESET_CTL_READY_TO_RESET bit. In some cases this is not fatal
error as driver will retry.

Let the caller of gen8_engine_reset_prepare() decide if a
failure in gen8_engine_reset_prepare is an error or not.

Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/5591
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_reset.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

Comments

Andi Shyti Dec. 4, 2023, 7:52 p.m. UTC | #1
Hi Nirmoy,

On Fri, Dec 01, 2023 at 04:44:43PM +0100, Nirmoy Das wrote:
> gen8_engine_reset_prepare() can fail when HW fails to set
> RESET_CTL_READY_TO_RESET bit. In some cases this is not fatal
> error as driver will retry.
> 
> Let the caller of gen8_engine_reset_prepare() decide if a
> failure in gen8_engine_reset_prepare is an error or not.
> 
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: John Harrison <John.C.Harrison@Intel.com>
> Cc: Andi Shyti <andi.shyti@linux.intel.com>
> Cc: Andrzej Hajda <andrzej.hajda@intel.com>
> Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/5591
> Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>

Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>

Thanks,
Andi
Andrzej Hajda Dec. 5, 2023, 6:58 a.m. UTC | #2
On 01.12.2023 16:44, Nirmoy Das wrote:
> gen8_engine_reset_prepare() can fail when HW fails to set
> RESET_CTL_READY_TO_RESET bit. In some cases this is not fatal
> error as driver will retry.
>
> Let the caller of gen8_engine_reset_prepare() decide if a
> failure in gen8_engine_reset_prepare is an error or not.
>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: John Harrison <John.C.Harrison@Intel.com>
> Cc: Andi Shyti <andi.shyti@linux.intel.com>
> Cc: Andrzej Hajda <andrzej.hajda@intel.com>
> Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/5591
> Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
> ---

Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>

Regards
Andrzej

>   drivers/gpu/drm/i915/gt/intel_reset.c | 8 ++++----
>   1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
> index d5ed904f355d..e6fbc6202c80 100644
> --- a/drivers/gpu/drm/i915/gt/intel_reset.c
> +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
> @@ -593,10 +593,10 @@ static int gen8_engine_reset_prepare(struct intel_engine_cs *engine)
>   	ret = __intel_wait_for_register_fw(uncore, reg, mask, ack,
>   					   700, 0, NULL);
>   	if (ret)
> -		gt_err(engine->gt,
> -		       "%s reset request timed out: {request: %08x, RESET_CTL: %08x}\n",
> -		       engine->name, request,
> -		       intel_uncore_read_fw(uncore, reg));
> +		GT_TRACE(engine->gt,
> +			 "%s reset request timed out: {request: %08x, RESET_CTL: %08x}\n",
> +			 engine->name, request,
> +			 intel_uncore_read_fw(uncore, reg));
>   
>   	return ret;
>   }
Tvrtko Ursulin Dec. 5, 2023, 8:34 a.m. UTC | #3
On 01/12/2023 15:44, Nirmoy Das wrote:
> gen8_engine_reset_prepare() can fail when HW fails to set
> RESET_CTL_READY_TO_RESET bit. In some cases this is not fatal
> error as driver will retry.
> 
> Let the caller of gen8_engine_reset_prepare() decide if a
> failure in gen8_engine_reset_prepare is an error or not.

No complaints per se but I don't see the caller deciding and it is not 
really reducing log level but converting to trace. So commit message and 
patch do not align for me which I think should be improved.

Regards,

Tvrtko

> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: John Harrison <John.C.Harrison@Intel.com>
> Cc: Andi Shyti <andi.shyti@linux.intel.com>
> Cc: Andrzej Hajda <andrzej.hajda@intel.com>
> Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/5591
> Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
> ---
>   drivers/gpu/drm/i915/gt/intel_reset.c | 8 ++++----
>   1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
> index d5ed904f355d..e6fbc6202c80 100644
> --- a/drivers/gpu/drm/i915/gt/intel_reset.c
> +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
> @@ -593,10 +593,10 @@ static int gen8_engine_reset_prepare(struct intel_engine_cs *engine)
>   	ret = __intel_wait_for_register_fw(uncore, reg, mask, ack,
>   					   700, 0, NULL);
>   	if (ret)
> -		gt_err(engine->gt,
> -		       "%s reset request timed out: {request: %08x, RESET_CTL: %08x}\n",
> -		       engine->name, request,
> -		       intel_uncore_read_fw(uncore, reg));
> +		GT_TRACE(engine->gt,
> +			 "%s reset request timed out: {request: %08x, RESET_CTL: %08x}\n",
> +			 engine->name, request,
> +			 intel_uncore_read_fw(uncore, reg));
>   
>   	return ret;
>   }
Nirmoy Das Dec. 5, 2023, 8:50 a.m. UTC | #4
Hi Tvrtko,

On 12/5/2023 9:34 AM, Tvrtko Ursulin wrote:
>
> On 01/12/2023 15:44, Nirmoy Das wrote:
>> gen8_engine_reset_prepare() can fail when HW fails to set
>> RESET_CTL_READY_TO_RESET bit. In some cases this is not fatal
>> error as driver will retry.
>>
>> Let the caller of gen8_engine_reset_prepare() decide if a
>> failure in gen8_engine_reset_prepare is an error or not.
>
> No complaints per se but I don't see the caller deciding and it is not 
> really reducing log level but converting to trace. So commit message 
> and patch do not align for me which I think should be improved.


I meant the return value is checked by the caller, gen8_reset_engines(). 
I will resend with a improved commit message.

Thanks,

Nirmoy

>
> Regards,
>
> Tvrtko
>
>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> Cc: John Harrison <John.C.Harrison@Intel.com>
>> Cc: Andi Shyti <andi.shyti@linux.intel.com>
>> Cc: Andrzej Hajda <andrzej.hajda@intel.com>
>> Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/5591
>> Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
>> ---
>>   drivers/gpu/drm/i915/gt/intel_reset.c | 8 ++++----
>>   1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c 
>> b/drivers/gpu/drm/i915/gt/intel_reset.c
>> index d5ed904f355d..e6fbc6202c80 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_reset.c
>> +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
>> @@ -593,10 +593,10 @@ static int gen8_engine_reset_prepare(struct 
>> intel_engine_cs *engine)
>>       ret = __intel_wait_for_register_fw(uncore, reg, mask, ack,
>>                          700, 0, NULL);
>>       if (ret)
>> -        gt_err(engine->gt,
>> -               "%s reset request timed out: {request: %08x, 
>> RESET_CTL: %08x}\n",
>> -               engine->name, request,
>> -               intel_uncore_read_fw(uncore, reg));
>> +        GT_TRACE(engine->gt,
>> +             "%s reset request timed out: {request: %08x, RESET_CTL: 
>> %08x}\n",
>> +             engine->name, request,
>> +             intel_uncore_read_fw(uncore, reg));
>>         return ret;
>>   }
Tvrtko Ursulin Dec. 5, 2023, 10:05 a.m. UTC | #5
On 05/12/2023 08:50, Nirmoy Das wrote:
> Hi Tvrtko,
> 
> On 12/5/2023 9:34 AM, Tvrtko Ursulin wrote:
>>
>> On 01/12/2023 15:44, Nirmoy Das wrote:
>>> gen8_engine_reset_prepare() can fail when HW fails to set
>>> RESET_CTL_READY_TO_RESET bit. In some cases this is not fatal
>>> error as driver will retry.
>>>
>>> Let the caller of gen8_engine_reset_prepare() decide if a
>>> failure in gen8_engine_reset_prepare is an error or not.
>>
>> No complaints per se but I don't see the caller deciding and it is not 
>> really reducing log level but converting to trace. So commit message 
>> and patch do not align for me which I think should be improved.
> 
> 
> I meant the return value is checked by the caller, gen8_reset_engines(). 
> I will resend with a improved commit message.

Ah okay, maybe my bad for not figuring out that possibility. I guess it 
might be passable as is, but yes, clearer commit text would be better.

Trace is good enough - we are not usually interested in seeing those as 
dbg/info/notice?

Regards,

Tvrtko

> 
> Thanks,
> 
> Nirmoy
> 
>>
>> Regards,
>>
>> Tvrtko
>>
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>> Cc: John Harrison <John.C.Harrison@Intel.com>
>>> Cc: Andi Shyti <andi.shyti@linux.intel.com>
>>> Cc: Andrzej Hajda <andrzej.hajda@intel.com>
>>> Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/5591
>>> Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
>>> ---
>>>   drivers/gpu/drm/i915/gt/intel_reset.c | 8 ++++----
>>>   1 file changed, 4 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c 
>>> b/drivers/gpu/drm/i915/gt/intel_reset.c
>>> index d5ed904f355d..e6fbc6202c80 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_reset.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
>>> @@ -593,10 +593,10 @@ static int gen8_engine_reset_prepare(struct 
>>> intel_engine_cs *engine)
>>>       ret = __intel_wait_for_register_fw(uncore, reg, mask, ack,
>>>                          700, 0, NULL);
>>>       if (ret)
>>> -        gt_err(engine->gt,
>>> -               "%s reset request timed out: {request: %08x, 
>>> RESET_CTL: %08x}\n",
>>> -               engine->name, request,
>>> -               intel_uncore_read_fw(uncore, reg));
>>> +        GT_TRACE(engine->gt,
>>> +             "%s reset request timed out: {request: %08x, RESET_CTL: 
>>> %08x}\n",
>>> +             engine->name, request,
>>> +             intel_uncore_read_fw(uncore, reg));
>>>         return ret;
>>>   }
Nirmoy Das Dec. 5, 2023, 10:44 a.m. UTC | #6
Hi Tvrtko,

On 12/5/2023 11:05 AM, Tvrtko Ursulin wrote:
>
> On 05/12/2023 08:50, Nirmoy Das wrote:
>> Hi Tvrtko,
>>
>> On 12/5/2023 9:34 AM, Tvrtko Ursulin wrote:
>>>
>>> On 01/12/2023 15:44, Nirmoy Das wrote:
>>>> gen8_engine_reset_prepare() can fail when HW fails to set
>>>> RESET_CTL_READY_TO_RESET bit. In some cases this is not fatal
>>>> error as driver will retry.
>>>>
>>>> Let the caller of gen8_engine_reset_prepare() decide if a
>>>> failure in gen8_engine_reset_prepare is an error or not.
>>>
>>> No complaints per se but I don't see the caller deciding and it is 
>>> not really reducing log level but converting to trace. So commit 
>>> message and patch do not align for me which I think should be improved.
>>
>>
>> I meant the return value is checked by the caller, 
>> gen8_reset_engines(). I will resend with a improved commit message.
>
> Ah okay, maybe my bad for not figuring out that possibility. I guess 
> it might be passable as is, but yes, clearer commit text would be better.

I sent a v2 already :)


>
> Trace is good enough - we are not usually interested in seeing those 
> as dbg/info/notice?


Idea is that all the GT related events are recorded in trace and dmesg 
could be noisy some times.


Regards,

Nirmoy

>
> Regards,
>
> Tvrtko
>
>>
>> Thanks,
>>
>> Nirmoy
>>
>>>
>>> Regards,
>>>
>>> Tvrtko
>>>
>>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>> Cc: John Harrison <John.C.Harrison@Intel.com>
>>>> Cc: Andi Shyti <andi.shyti@linux.intel.com>
>>>> Cc: Andrzej Hajda <andrzej.hajda@intel.com>
>>>> Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/5591
>>>> Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
>>>> ---
>>>>   drivers/gpu/drm/i915/gt/intel_reset.c | 8 ++++----
>>>>   1 file changed, 4 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c 
>>>> b/drivers/gpu/drm/i915/gt/intel_reset.c
>>>> index d5ed904f355d..e6fbc6202c80 100644
>>>> --- a/drivers/gpu/drm/i915/gt/intel_reset.c
>>>> +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
>>>> @@ -593,10 +593,10 @@ static int gen8_engine_reset_prepare(struct 
>>>> intel_engine_cs *engine)
>>>>       ret = __intel_wait_for_register_fw(uncore, reg, mask, ack,
>>>>                          700, 0, NULL);
>>>>       if (ret)
>>>> -        gt_err(engine->gt,
>>>> -               "%s reset request timed out: {request: %08x, 
>>>> RESET_CTL: %08x}\n",
>>>> -               engine->name, request,
>>>> -               intel_uncore_read_fw(uncore, reg));
>>>> +        GT_TRACE(engine->gt,
>>>> +             "%s reset request timed out: {request: %08x, 
>>>> RESET_CTL: %08x}\n",
>>>> +             engine->name, request,
>>>> +             intel_uncore_read_fw(uncore, reg));
>>>>         return ret;
>>>>   }
Tvrtko Ursulin Dec. 5, 2023, 10:52 a.m. UTC | #7
On 05/12/2023 10:44, Nirmoy Das wrote:
> Hi Tvrtko,
> 
> On 12/5/2023 11:05 AM, Tvrtko Ursulin wrote:
>>
>> On 05/12/2023 08:50, Nirmoy Das wrote:
>>> Hi Tvrtko,
>>>
>>> On 12/5/2023 9:34 AM, Tvrtko Ursulin wrote:
>>>>
>>>> On 01/12/2023 15:44, Nirmoy Das wrote:
>>>>> gen8_engine_reset_prepare() can fail when HW fails to set
>>>>> RESET_CTL_READY_TO_RESET bit. In some cases this is not fatal
>>>>> error as driver will retry.
>>>>>
>>>>> Let the caller of gen8_engine_reset_prepare() decide if a
>>>>> failure in gen8_engine_reset_prepare is an error or not.
>>>>
>>>> No complaints per se but I don't see the caller deciding and it is 
>>>> not really reducing log level but converting to trace. So commit 
>>>> message and patch do not align for me which I think should be improved.
>>>
>>>
>>> I meant the return value is checked by the caller, 
>>> gen8_reset_engines(). I will resend with a improved commit message.
>>
>> Ah okay, maybe my bad for not figuring out that possibility. I guess 
>> it might be passable as is, but yes, clearer commit text would be better.
> 
> I sent a v2 already :)
> 
> 
>>
>> Trace is good enough - we are not usually interested in seeing those 
>> as dbg/info/notice?
> 
> 
> Idea is that all the GT related events are recorded in trace and dmesg 
> could be noisy some times.

Although trace does not help on production deployments so we need to be 
sure the fact this timeout is hit is totally un-interesting. I see John 
has some concerns that it may not be so. And I don't have currently a 
view into how frequent they are (timeouts) or which platforms are affected.

Regards,

Tvrtko

> 
> 
> Regards,
> 
> Nirmoy
> 
>>
>> Regards,
>>
>> Tvrtko
>>
>>>
>>> Thanks,
>>>
>>> Nirmoy
>>>
>>>>
>>>> Regards,
>>>>
>>>> Tvrtko
>>>>
>>>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>>> Cc: John Harrison <John.C.Harrison@Intel.com>
>>>>> Cc: Andi Shyti <andi.shyti@linux.intel.com>
>>>>> Cc: Andrzej Hajda <andrzej.hajda@intel.com>
>>>>> Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/5591
>>>>> Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
>>>>> ---
>>>>>   drivers/gpu/drm/i915/gt/intel_reset.c | 8 ++++----
>>>>>   1 file changed, 4 insertions(+), 4 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c 
>>>>> b/drivers/gpu/drm/i915/gt/intel_reset.c
>>>>> index d5ed904f355d..e6fbc6202c80 100644
>>>>> --- a/drivers/gpu/drm/i915/gt/intel_reset.c
>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
>>>>> @@ -593,10 +593,10 @@ static int gen8_engine_reset_prepare(struct 
>>>>> intel_engine_cs *engine)
>>>>>       ret = __intel_wait_for_register_fw(uncore, reg, mask, ack,
>>>>>                          700, 0, NULL);
>>>>>       if (ret)
>>>>> -        gt_err(engine->gt,
>>>>> -               "%s reset request timed out: {request: %08x, 
>>>>> RESET_CTL: %08x}\n",
>>>>> -               engine->name, request,
>>>>> -               intel_uncore_read_fw(uncore, reg));
>>>>> +        GT_TRACE(engine->gt,
>>>>> +             "%s reset request timed out: {request: %08x, 
>>>>> RESET_CTL: %08x}\n",
>>>>> +             engine->name, request,
>>>>> +             intel_uncore_read_fw(uncore, reg));
>>>>>         return ret;
>>>>>   }
diff mbox series

Patch

diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
index d5ed904f355d..e6fbc6202c80 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -593,10 +593,10 @@  static int gen8_engine_reset_prepare(struct intel_engine_cs *engine)
 	ret = __intel_wait_for_register_fw(uncore, reg, mask, ack,
 					   700, 0, NULL);
 	if (ret)
-		gt_err(engine->gt,
-		       "%s reset request timed out: {request: %08x, RESET_CTL: %08x}\n",
-		       engine->name, request,
-		       intel_uncore_read_fw(uncore, reg));
+		GT_TRACE(engine->gt,
+			 "%s reset request timed out: {request: %08x, RESET_CTL: %08x}\n",
+			 engine->name, request,
+			 intel_uncore_read_fw(uncore, reg));
 
 	return ret;
 }