[2/4] drm/i915: Fix WaProgramMgsrForL3BankSpecificMmioReads
diff mbox series

Message ID 20190709210620.15805-3-tvrtko.ursulin@linux.intel.com
State New
Headers show
Series
  • MCR fixes
Related show

Commit Message

Tvrtko Ursulin July 9, 2019, 9:06 p.m. UTC
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Two issues in this code:

1.
fls() usage is incorrect causing off by one in subslice mask lookup,
which in other words means subslice mask of all zeroes is always used
(subslice mask of a slice which is not present, or even out of bounds
array access), rendering the checks in wa_init_mcr either futile or
random.

2.
Condition in WARN_ON is not correct. It is doing a bitwise and operation
between a positive (present subslices) and negative mask (disabled L3
banks).

This means that with corrected fls() usage the assert would always
incorrectly fail.

We can fix this by invereting the fuse bits in the check.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: fe864b76c2ab ("drm/i915: Implement WaProgramMgsrForL3BankSpecificMmioReads")
---
 drivers/gpu/drm/i915/gt/intel_workarounds.c | 26 ++++++++++-----------
 1 file changed, 13 insertions(+), 13 deletions(-)

Comments

Chris Wilson July 9, 2019, 9:11 p.m. UTC | #1
Quoting Tvrtko Ursulin (2019-07-09 22:06:18)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Two issues in this code:
> 
> 1.
> fls() usage is incorrect causing off by one in subslice mask lookup,
> which in other words means subslice mask of all zeroes is always used
> (subslice mask of a slice which is not present, or even out of bounds
> array access), rendering the checks in wa_init_mcr either futile or
> random.
> 
> 2.
> Condition in WARN_ON is not correct. It is doing a bitwise and operation
> between a positive (present subslices) and negative mask (disabled L3
> banks).
> 
> This means that with corrected fls() usage the assert would always
> incorrectly fail.
> 
> We can fix this by invereting the fuse bits in the check.

s/invereting/inverting/

> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Fixes: fe864b76c2ab ("drm/i915: Implement WaProgramMgsrForL3BankSpecificMmioReads")
> ---
>  drivers/gpu/drm/i915/gt/intel_workarounds.c | 26 ++++++++++-----------
>  1 file changed, 13 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> index 9e069286d3ce..b5f19ad48d22 100644
> --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> @@ -776,26 +776,26 @@ wa_init_mcr(struct drm_i915_private *i915, struct i915_wa_list *wal)
>          * something more complex that requires checking the range of every
>          * MMIO read).
>          */
> -       if (INTEL_GEN(i915) >= 10 &&
> -           is_power_of_2(sseu->slice_mask)) {
> +       if (INTEL_GEN(i915) >= 10 && is_power_of_2(sseu->slice_mask)) {
>                 /*
> -                * read FUSE3 for enabled L3 Bank IDs, if L3 Bank matches
> -                * enabled subslice, no need to redirect MCR packet
> +                * Read FUSE3 for enabled L3 Bank IDs, if L3 Bank matches
> +                * enabled subslice, no need to redirect MCR packet.
>                  */
> -               u32 slice = fls(sseu->slice_mask);
> -               u32 fuse3 =
> -                       intel_uncore_read(&i915->uncore, GEN10_MIRROR_FUSE3);
> -               u8 ss_mask = sseu->subslice_mask[slice];
> +               unsigned int slice = fls(sseu->slice_mask) - 1;
> +               u8 ss, en, dis;
>  
> -               u8 enabled_mask = (ss_mask | ss_mask >>
> -                                  GEN10_L3BANK_PAIR_COUNT) & GEN10_L3BANK_MASK;
> -               u8 disabled_mask = fuse3 & GEN10_L3BANK_MASK;
> +               GEM_BUG_ON(slice >= ARRAY_SIZE(sseu->subslice_mask));
> +               ss = sseu->subslice_mask[slice];
> +
> +               en = (ss | ss >> GEN10_L3BANK_PAIR_COUNT) & GEN10_L3BANK_MASK;
> +               dis = intel_uncore_read(&i915->uncore, GEN10_MIRROR_FUSE3) &
> +                     GEN10_L3BANK_MASK;

Ok.

>                 /*
>                  * Production silicon should have matched L3Bank and
> -                * subslice enabled
> +                * subslice enabled.
>                  */
> -               WARN_ON((enabled_mask & disabled_mask) != enabled_mask);
> +               WARN_ON((en & ~dis) != en);

That certainly makes more sense. I always feared that was some deep
magic to reflect the underlying HW.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris
Tvrtko Ursulin July 11, 2019, 9:15 a.m. UTC | #2
On 09/07/2019 22:11, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-07-09 22:06:18)
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> Two issues in this code:
>>
>> 1.
>> fls() usage is incorrect causing off by one in subslice mask lookup,
>> which in other words means subslice mask of all zeroes is always used
>> (subslice mask of a slice which is not present, or even out of bounds
>> array access), rendering the checks in wa_init_mcr either futile or
>> random.
>>
>> 2.
>> Condition in WARN_ON is not correct. It is doing a bitwise and operation
>> between a positive (present subslices) and negative mask (disabled L3
>> banks).
>>
>> This means that with corrected fls() usage the assert would always
>> incorrectly fail.
>>
>> We can fix this by invereting the fuse bits in the check.
> 
> s/invereting/inverting/
> 
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> Fixes: fe864b76c2ab ("drm/i915: Implement WaProgramMgsrForL3BankSpecificMmioReads")
>> ---
>>   drivers/gpu/drm/i915/gt/intel_workarounds.c | 26 ++++++++++-----------
>>   1 file changed, 13 insertions(+), 13 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
>> index 9e069286d3ce..b5f19ad48d22 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
>> +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
>> @@ -776,26 +776,26 @@ wa_init_mcr(struct drm_i915_private *i915, struct i915_wa_list *wal)
>>           * something more complex that requires checking the range of every
>>           * MMIO read).
>>           */
>> -       if (INTEL_GEN(i915) >= 10 &&
>> -           is_power_of_2(sseu->slice_mask)) {
>> +       if (INTEL_GEN(i915) >= 10 && is_power_of_2(sseu->slice_mask)) {
>>                  /*
>> -                * read FUSE3 for enabled L3 Bank IDs, if L3 Bank matches
>> -                * enabled subslice, no need to redirect MCR packet
>> +                * Read FUSE3 for enabled L3 Bank IDs, if L3 Bank matches
>> +                * enabled subslice, no need to redirect MCR packet.
>>                   */
>> -               u32 slice = fls(sseu->slice_mask);
>> -               u32 fuse3 =
>> -                       intel_uncore_read(&i915->uncore, GEN10_MIRROR_FUSE3);
>> -               u8 ss_mask = sseu->subslice_mask[slice];
>> +               unsigned int slice = fls(sseu->slice_mask) - 1;
>> +               u8 ss, en, dis;
>>   
>> -               u8 enabled_mask = (ss_mask | ss_mask >>
>> -                                  GEN10_L3BANK_PAIR_COUNT) & GEN10_L3BANK_MASK;
>> -               u8 disabled_mask = fuse3 & GEN10_L3BANK_MASK;
>> +               GEM_BUG_ON(slice >= ARRAY_SIZE(sseu->subslice_mask));
>> +               ss = sseu->subslice_mask[slice];
>> +
>> +               en = (ss | ss >> GEN10_L3BANK_PAIR_COUNT) & GEN10_L3BANK_MASK;
>> +               dis = intel_uncore_read(&i915->uncore, GEN10_MIRROR_FUSE3) &
>> +                     GEN10_L3BANK_MASK;
> 
> Ok.
> 
>>                  /*
>>                   * Production silicon should have matched L3Bank and
>> -                * subslice enabled
>> +                * subslice enabled.
>>                   */
>> -               WARN_ON((enabled_mask & disabled_mask) != enabled_mask);
>> +               WARN_ON((en & ~dis) != en);
> 
> That certainly makes more sense. I always feared that was some deep
> magic to reflect the underlying HW.

It could still be magic and I am not confident to merge this. Well.. 
maybe.. It does fix out of bounds access. But it makes the WARN_ON fire 
on a SKU with subslice mask of 0b1100 1111 and L3 fuse bits of 0xb0100. 
Is that better or worse than a WARN_ON which never fires? :)

According to the docs these l3 fuse bits mean L3 banks 2 & 6 are not 
present. But what is the relationship with subslices 4 & 5 I couldn't 
find info on. I'll keep trying.

I also chatted with Lionel about this, since he typically knows GPU 
innards quite well, but in this case he also wasn't sure of this code.

> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

Thanks!

Regards,

Tvrtko

Patch
diff mbox series

diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 9e069286d3ce..b5f19ad48d22 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -776,26 +776,26 @@  wa_init_mcr(struct drm_i915_private *i915, struct i915_wa_list *wal)
 	 * something more complex that requires checking the range of every
 	 * MMIO read).
 	 */
-	if (INTEL_GEN(i915) >= 10 &&
-	    is_power_of_2(sseu->slice_mask)) {
+	if (INTEL_GEN(i915) >= 10 && is_power_of_2(sseu->slice_mask)) {
 		/*
-		 * read FUSE3 for enabled L3 Bank IDs, if L3 Bank matches
-		 * enabled subslice, no need to redirect MCR packet
+		 * Read FUSE3 for enabled L3 Bank IDs, if L3 Bank matches
+		 * enabled subslice, no need to redirect MCR packet.
 		 */
-		u32 slice = fls(sseu->slice_mask);
-		u32 fuse3 =
-			intel_uncore_read(&i915->uncore, GEN10_MIRROR_FUSE3);
-		u8 ss_mask = sseu->subslice_mask[slice];
+		unsigned int slice = fls(sseu->slice_mask) - 1;
+		u8 ss, en, dis;
 
-		u8 enabled_mask = (ss_mask | ss_mask >>
-				   GEN10_L3BANK_PAIR_COUNT) & GEN10_L3BANK_MASK;
-		u8 disabled_mask = fuse3 & GEN10_L3BANK_MASK;
+		GEM_BUG_ON(slice >= ARRAY_SIZE(sseu->subslice_mask));
+		ss = sseu->subslice_mask[slice];
+
+		en = (ss | ss >> GEN10_L3BANK_PAIR_COUNT) & GEN10_L3BANK_MASK;
+		dis = intel_uncore_read(&i915->uncore, GEN10_MIRROR_FUSE3) &
+		      GEN10_L3BANK_MASK;
 
 		/*
 		 * Production silicon should have matched L3Bank and
-		 * subslice enabled
+		 * subslice enabled.
 		 */
-		WARN_ON((enabled_mask & disabled_mask) != enabled_mask);
+		WARN_ON((en & ~dis) != en);
 	}
 
 	if (INTEL_GEN(i915) >= 11)