[10/12] drm/i915/guc: Support larger contexts on newer hardware

Message ID 20220712233136.1044951-11-John.C.Harrison@Intel.com (mailing list archive)
State New, archived
Series Random assortment of (mostly) GuC related patches

Commit Message

John Harrison July 12, 2022, 11:31 p.m. UTC
From: Matthew Brost <matthew.brost@intel.com>

The GuC needs a copy of a golden context for implementing watchdog
resets (aka media resets). This context is larger on newer platforms.
So adjust the size being allocated/copied accordingly.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

Comments

Tvrtko Ursulin July 18, 2022, 12:35 p.m. UTC | #1
On 13/07/2022 00:31, John.C.Harrison@Intel.com wrote:
> From: Matthew Brost <matthew.brost@intel.com>
> 
> The GuC needs a copy of a golden context for implementing watchdog
> resets (aka media resets). This context is larger on newer platforms.
> So adjust the size being allocated/copied accordingly.

What were the consequences of this being too small? Media watchdog reset 
broken impacting userspace? Platforms? Do we have an IGT testcase? Do we 
need a Fixes: tag? Copy stable?

Regards,

Tvrtko

John Harrison July 19, 2022, 12:13 a.m. UTC | #2
On 7/18/2022 05:35, Tvrtko Ursulin wrote:
>
> On 13/07/2022 00:31, John.C.Harrison@Intel.com wrote:
>> From: Matthew Brost <matthew.brost@intel.com>
>>
>> The GuC needs a copy of a golden context for implementing watchdog
>> resets (aka media resets). This context is larger on newer platforms.
>> So adjust the size being allocated/copied accordingly.
>
> What were the consequences of this being too small? Media watchdog 
> reset broken impacting userspace? Platforms? Do we have an IGT 
> testcase? Do we need a Fixes: tag? Copy stable?
Yes. Not sure if we have an IGT for the media watchdog. I recall writing 
something a long time back but I don't think it ever got merged due to 
push back that I don't recall right now. And no because it only affects 
DG2 onwards which is still forceprobed.

John.


>
> Regards,
>
> Tvrtko
>
Tvrtko Ursulin July 19, 2022, 9:56 a.m. UTC | #3
On 19/07/2022 01:13, John Harrison wrote:
> On 7/18/2022 05:35, Tvrtko Ursulin wrote:
>>
>> On 13/07/2022 00:31, John.C.Harrison@Intel.com wrote:
>>> From: Matthew Brost <matthew.brost@intel.com>
>>>
>>> The GuC needs a copy of a golden context for implementing watchdog
>>> resets (aka media resets). This context is larger on newer platforms.
>>> So adjust the size being allocated/copied accordingly.
>>
>> What were the consequences of this being too small? Media watchdog 
>> reset broken impacting userspace? Platforms? Do we have an IGT 
>> testcase? Do we need a Fixes: tag? Copy stable?
> Yes. Not sure if we have an IGT for the media watchdog. I recall writing 
> something a long time back but I don't think it ever got merged due to 
> push back that I don't recall right now. And no because it only affects 
> DG2 onwards which is still forceprobed.

Right, hm, I don't know if the MBD SKU promise for DG2 relies on force 
probe removal or not. My impression certainly was that a bunch of uapi 
we recently merged made people happy in that respect - that we satisfied 
the commitment to deliver that support with 5.19. Maybe I am wrong, or 
perhaps to err on the side of safety you could add the right Fixes: tag 
regardless? Pick some patch which enables GuC for DG2 if there isn't 
anything better I guess. Or you could check with James.

Regards,

Tvrtko

> John.
> 
> 
>>
>> Regards,
>>
>> Tvrtko
>>
John Harrison July 22, 2022, 7:32 p.m. UTC | #4
On 7/19/2022 02:56, Tvrtko Ursulin wrote:
> On 19/07/2022 01:13, John Harrison wrote:
>> On 7/18/2022 05:35, Tvrtko Ursulin wrote:
>>>
>>> On 13/07/2022 00:31, John.C.Harrison@Intel.com wrote:
>>>> From: Matthew Brost <matthew.brost@intel.com>
>>>>
>>>> The GuC needs a copy of a golden context for implementing watchdog
>>>> resets (aka media resets). This context is larger on newer platforms.
>>>> So adjust the size being allocated/copied accordingly.
>>>
>>> What were the consequences of this being too small? Media watchdog 
>>> reset broken impacting userspace? Platforms? Do we have an IGT 
>>> testcase? Do we need a Fixes: tag? Copy stable?
>> Yes. Not sure if we have an IGT for the media watchdog. I recall 
>> writing something a long time back but I don't think it ever got 
>> merged due to push back that I don't recall right now. And no because 
>> it only affects DG2 onwards which is still forceprobed.
>
> Right, hm, I don't know if the MBD SKU promise for DG2 relies on force 
> probe removal or not. My impression certainly was that a bunch of uapi 
> we recently merged made people happy in that respect - that we 
> satisfied the commitment to deliver that support with 5.19. Maybe I am 
> wrong, or perhaps to err on the side of safety you could add the right 
> Fixes: tag regardless? Pick some patch which enables GuC for DG2 if 
> there isn't anything better I guess. Or you could check with James.
Adding "Fixes: random patch that is actually irrelevant" seems like the 
wrong thing to do. This is not a bug fix. It is new platform support. 
And it is not the only thing required to support that new platform that 
is not currently in 5.19. E.g. DG2 requires at least GuC v70.4.2 to 
support some hardware w/a's. The guidance for that was to not add Fixes 
tags but to send a manual pull request once everything is ready.

John.


>
> Regards,
>
> Tvrtko
>
>> John.
>>
>>
>>>
>>> Regards,
>>>
>>> Tvrtko
>>>
Tvrtko Ursulin July 25, 2022, 11:24 a.m. UTC | #5
On 22/07/2022 20:32, John Harrison wrote:
> On 7/19/2022 02:56, Tvrtko Ursulin wrote:
>> On 19/07/2022 01:13, John Harrison wrote:
>>> On 7/18/2022 05:35, Tvrtko Ursulin wrote:
>>>>
>>>> On 13/07/2022 00:31, John.C.Harrison@Intel.com wrote:
>>>>> From: Matthew Brost <matthew.brost@intel.com>
>>>>>
>>>>> The GuC needs a copy of a golden context for implementing watchdog
>>>>> resets (aka media resets). This context is larger on newer platforms.
>>>>> So adjust the size being allocated/copied accordingly.
>>>>
>>>> What were the consequences of this being too small? Media watchdog 
>>>> reset broken impacting userspace? Platforms? Do we have an IGT 
>>>> testcase? Do we need a Fixes: tag? Copy stable?
>>> Yes. Not sure if we have an IGT for the media watchdog. I recall 
>>> writing something a long time back but I don't think it ever got 
>>> merged due to push back that I don't recall right now. And no because 
>>> it only affects DG2 onwards which is still forceprobed.
>>
>> Right, hm, I don't know if the MBD SKU promise for DG2 relies on force 
>> probe removal or not. My impression certainly was that a bunch of uapi 
>> we recently merged made people happy in that respect - that we 
>> satisfied the commitment to deliver that support with 5.19. Maybe I am 
>> wrong, or perhaps to err on the side of safety you could add the right 
>> Fixes: tag regardless? Pick some patch which enables GuC for DG2 if 
>> there isn't anything better I guess. Or you could check with James.
> Adding "Fixes: random patch that is actually irrelevant" seems like the 
> wrong thing to do. This is not a bug fix. It is new platform support. 
> And it is not the only thing required to support that new platform that 
> is not currently in 5.19. E.g. DG2 requires at least GuC v70.4.2 to 
> support some hardware w/a's. The guidance for that was to not add Fixes 
> tags but to send a manual pull request once everything is ready.

All I know is that some people were really interested(*) that 5.19 
contains everything needed for DG2. Hence I suggested to err on the side 
of safety, or at least check with folks.

Bottom line is, if you want this fix to be in 5.19, or even 5.20, you 
should add a Fixes: tag. Otherwise it will be in 5.21 at the earliest. 
Your call, I only tried to be helpful and avoid another failure.

Regards,

Tvrtko

*) To the point of actively pinging the maintainers to ensure patches do 
not miss the merge window.

Patch

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index ba7541f3ca610..74cbe8eaf5318 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -464,7 +464,11 @@  static void fill_engine_enable_masks(struct intel_gt *gt,
 }
 
 #define LR_HW_CONTEXT_SIZE (80 * sizeof(u32))
-#define LRC_SKIP_SIZE (LRC_PPHWSP_SZ * PAGE_SIZE + LR_HW_CONTEXT_SIZE)
+#define XEHP_LR_HW_CONTEXT_SIZE (96 * sizeof(u32))
+#define LR_HW_CONTEXT_SZ(i915) (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50) ? \
+				    XEHP_LR_HW_CONTEXT_SIZE : \
+				    LR_HW_CONTEXT_SIZE)
+#define LRC_SKIP_SIZE(i915) (LRC_PPHWSP_SZ * PAGE_SIZE + LR_HW_CONTEXT_SZ(i915))
 static int guc_prep_golden_context(struct intel_guc *guc)
 {
 	struct intel_gt *gt = guc_to_gt(guc);
@@ -525,7 +529,7 @@  static int guc_prep_golden_context(struct intel_guc *guc)
 		 * on all engines).
 		 */
 		ads_blob_write(guc, ads.eng_state_size[guc_class],
-			       real_size - LRC_SKIP_SIZE);
+			       real_size - LRC_SKIP_SIZE(gt->i915));
 		ads_blob_write(guc, ads.golden_context_lrca[guc_class],
 			       addr_ggtt);
 
@@ -599,7 +603,7 @@  static void guc_init_golden_context(struct intel_guc *guc)
 		}
 
 		GEM_BUG_ON(ads_blob_read(guc, ads.eng_state_size[guc_class]) !=
-			   real_size - LRC_SKIP_SIZE);
+			   real_size - LRC_SKIP_SIZE(gt->i915));
 		GEM_BUG_ON(ads_blob_read(guc, ads.golden_context_lrca[guc_class]) != addr_ggtt);
 
 		addr_ggtt += alloc_size;
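
For readers who want to see what the macro change amounts to numerically, here is a minimal userspace sketch of the skip-size arithmetic. It is not the driver code. The 80 and 96 dword counts and the (12, 50) version cut-off come from the patch above; the SKETCH_* names, the 4096-byte page size, the one-page PPHWSP size, the version encoding, and the helper function names are assumptions made purely for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-ins for PAGE_SIZE and LRC_PPHWSP_SZ (illustrative values). */
#define SKETCH_PAGE_SIZE     4096u
#define SKETCH_LRC_PPHWSP_SZ 1u

/* Hypothetical packing of a graphics IP version into one comparable integer. */
#define SKETCH_IP_VER(ver, rel) (((ver) << 8) | (rel))

/* Dword counts taken from the patch. */
#define LR_HW_CONTEXT_SIZE      (80 * sizeof(uint32_t))
#define XEHP_LR_HW_CONTEXT_SIZE (96 * sizeof(uint32_t))

/* Size in bytes of the HW context header that is skipped for a given IP version. */
static inline size_t lr_hw_context_size(unsigned int ip_ver)
{
	return ip_ver >= SKETCH_IP_VER(12, 50) ?
		XEHP_LR_HW_CONTEXT_SIZE : LR_HW_CONTEXT_SIZE;
}

/* Total bytes skipped: the per-process HWSP page(s) plus the HW context header. */
static inline size_t lrc_skip_size(unsigned int ip_ver)
{
	return SKETCH_LRC_PPHWSP_SZ * SKETCH_PAGE_SIZE + lr_hw_context_size(ip_ver);
}

/* Engine state size as written to the ADS: context size minus the skipped part. */
static inline size_t eng_state_size(size_t real_size, unsigned int ip_ver)
{
	return real_size - lrc_skip_size(ip_ver);
}
```

Under these assumptions, the skipped region grows by 16 dwords (64 bytes) at version 12.50 and later, so the engine state size reported to the GuC for the same context size shrinks by 64 bytes compared with the old fixed macro.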